APOPTOSIS MODULATORS THAT INTERACT WITH THE
HUNTINGTON'S DISEASE GENE

BACKGROUND OF THE INVENTION 

This application relates to a family of apoptosis modulators that interact with the 
Huntington's Disease gene product, and to methods and compositions relating thereto. 

"Interacting proteins" are proteins which associate in vivo to form specific complexes. 
Non-covalent bonds, including hydrogen bonds, hydrophobic interactions and other
molecular associations, form between the proteins when two protein surfaces are matched or
have affinity for each other. This affinity or match is required for the recognition of the two
proteins, and the formation of an interaction. Protein-protein interactions are involved in the
assembly of enzyme subunits; in antigen-antibody reactions; in forming the supramolecular
structures of ribosomes, filaments, and viruses; in transport; and in the interaction of
receptors on a cell with growth factors and hormones.

Huntington's disease is an adult onset disorder characterized by selective neuronal loss
in discrete regions of the brain and spinal cord that leads to progressive movement disorder,
personality change and intellectual decline. From onset, which generally occurs around age
40, the disease progresses with worsening symptoms, ending in death approximately 18 years
after onset.

While the biochemical cause of Huntington's disease has remained elusive, a mutation in a
gene within the chromosome 4p16.3 subband has been identified and linked to the disease.
This gene, referred to as the Huntington's Disease or HD gene, contains two repeat regions,
a CAG repeat region and a CCG repeat region. Testing of Huntington's disease patients has
shown that the CAG region is highly polymorphic, and that the number of CAG repeat units in
the CAG repeat region is a very reliable indicator of having inherited the gene for
Huntington's disease. Thus, in control individuals and in most individuals suffering from
neuropsychiatric disorders other than Huntington's disease, the number of CAG repeats is
between 9 and 35, while in individuals suffering from Huntington's disease the number of CAG
repeats is expanded and is 36 or greater.
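The repeat-count thresholds stated above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a plain DNA string, and counting the longest uninterrupted CAG run is a simplification that is not described in this specification and is not a diagnostic method.

```python
import re

# Threshold taken from the text: 36 or more CAG repeats is described as expanded.
EXPANDED_MIN = 36


def longest_cag_run(seq: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG run.

    Hypothetical helper: real repeat sizing is done on amplified fragments,
    not by scanning a raw string.
    """
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(n_repeats: int) -> str:
    """Classify a CAG repeat count using the 36-repeat threshold from the text."""
    return "expanded" if n_repeats >= EXPANDED_MIN else "not expanded"


if __name__ == "__main__":
    toy_allele = "ATG" + "CAG" * 44 + "CCG" * 7  # toy sequence, not the HD gene
    n = longest_cag_run(toy_allele)
    print(n, classify_repeat_count(n))
```

The control range quoted in the text (9 to 35 repeats) falls entirely in the "not expanded" branch of this sketch.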
To date, no differences have been observed at either the total RNA, mRNA or protein
levels between normal and HD-affected individuals. Thus, the function of the HD protein
and its role in the pathogenesis of Huntington's Disease remain to be elucidated.

SUMMARY OF THE INVENTION

We have now identified a protein, designated HIP1, that interacts differently with
the gene product of a normal (16 CAG repeat) and an expanded (>44 CAG repeat) HD gene.
The HIP1 protein originally isolated from a yeast two-hybrid screen is encoded by a 1.2 kb
cDNA (Seq. ID No. 1), devoid of stop codons, that is expressed as a 400 amino acid
polypeptide (Seq. ID No. 2). Subsequent study has elucidated additional sequence for HIP1
such that a 1090 amino acid protein is now known (Seq. ID No. 5). Expression of the HIP1
protein was found to be enriched in the brain.

Analysis of the sequence of the HIP1 protein indicated that it includes a death effector
domain (DED), suggesting an apoptotic function. Thus, it appears that a normal function of
huntingtin may be to bind HIP1 and related apoptosis modulators, reducing their effectiveness
in stimulating cell death. Since expanded huntingtin performs this function less well, there is
an increase in HIP1-modulated cell death in individuals with an expanded repeat in the HD
gene. Furthermore, additional members of the same family of proteins have been identified
which also contain a DED. Thus, the present invention provides a new class of apoptotic
modulators which are referred to as HIP-apoptosis modulating proteins.

This understanding of the likely role of huntingtin and HIP1 or related proteins in the
pathology of Huntington's Disease offers several possibilities for therapy. First, because the
function of huntingtin apparently depends at least in part on the ability to interact with
HIP-apoptosis modulating proteins, added expression (e.g., via gene therapy) of normal
(non-expanded) huntingtin or of the HIP-binding region of huntingtin should provide a
therapeutic benefit. Other DED-interacting peptides could also be used to mask and reduce the
interaction of HIP-apoptosis modulating proteins with the death signaling complex.
Alternatively, a mutant form of HIP-protein from which the DED has been deleted might be
introduced, for example using gene therapy techniques. Because HIP-apoptosis modulating
proteins have been shown to self-associate, a protein with a deleted DED may compete with
endogenous HIP-protein in the formation of these associations, thereby reducing the amount
of apoptotically-active HIP-protein.

BRIEF DESCRIPTION OF THE DRAWING 
Fig. 1 graphically depicts the amount of interaction between HIP1 and huntingtin
proteins with varying lengths of polyglutamine repeat;
Fig. 2 compares the nucleic acid sequences of human and murine HIP1 and HIP1a;
Fig. 3 compares the amino acid sequences of human and murine HIP1 and HIP1a;
Fig. 4 shows the sequences of various death effector domains in comparison to the
DED of human and murine HIP1 and HIP1a;
Fig. 5 shows the genomic organization of human HIP1;
Fig. 6 compares the sequences of human HIP1 with ZK370.3 protein of C. elegans;
Fig. 7 shows mouse ESTs with homology to human HIP1 cDNA used to screen a
mouse brain library;
Fig. 8 shows the effect of HIP1 on susceptibility of cells to stress; and
Figs. 9A - 9C show the toxicity of HIP1 in the presence of huntingtin with different
lengths of polyglutamine repeats.

DETAILED DESCRIPTION OF THE INVENTION 

This application relates to a new family of proteins that function as modulators of
apoptosis. At least some of these proteins, notably the human protein designated HIP1, interact
with the gene product of the Huntington's disease gene. Other proteins within the family
possess at least 40% and preferably more than 50% nucleotide identity with HIP1 and include
a death effector domain (DED). Such proteins are referred to in the specification and claims
hereof as "HIP-apoptosis modulating proteins."

The first HIP-apoptosis modulating protein identified was designated as HIP1. HIP1
was identified using the yeast two-hybrid system described in US Patent No. 5,283,173, which
is incorporated herein by reference. Briefly, this system utilizes two chimeric genes or
plasmids expressible in a yeast host. The yeast host is selected to contain a detectable marker
gene having a binding site for the DNA binding domain of a transcriptional activator. The
first chimeric gene or plasmid encodes a DNA-binding domain which recognizes the binding
site of the selectable marker gene and a test protein or protein fragment. The second chimeric
gene or plasmid encodes a second test protein and a transcriptional activation domain.
The two chimeric genes or plasmids are introduced into the host cell and expressed, and the
cells are cultivated. Expression of the detectable marker gene only occurs when the gene
product of the first chimeric gene or plasmid binds to the binding site of the
detectable marker gene, and a transcriptional activation domain is brought into sufficient
proximity to the DNA-binding domain, an occurrence which is facilitated by protein-protein
interactions between the first and second test proteins. By selecting for cells expressing the
detectable marker gene, those cells which contain chimeric genes or plasmids for interacting
proteins can be identified, and the gene can be recovered and identified.

In testing for Huntingtin Interacting Proteins, several different plasmids were
prepared containing portions of the human HD gene. The first four, identified as 16pGBT9,
44pGBT9, 80pGBT9 and 128pGBT9, were GAL4 DNA binding domain-HD in-frame
fusions containing nucleotides 314 to 1955 (amino acids 1-540) of the published HD cDNA
sequences cloned into the vector pGBT9 (Clontech). These plasmids contain a CAG repeat
region of 16, 44, 80 and 128 glutamine-encoding repeats, respectively. A clone (DMK
BamHIpGBT9) was made by fusing a cDNA encoding the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) in-frame with the GAL4-DNA BD of
pGBT9 and was used as a negative control.

These plasmids have been used to identify and characterize HIP1, as well as two
additional HD-interacting proteins, HIP2 and HIP3, which have not yet been tested for
function as apoptosis modulators. These plasmids can be further used for the identification of
additional interacting proteins which do act as apoptosis modulators, and for tests to refine
the region on the protein in which the interaction occurs. Thus, one aspect of the invention is
these four plasmids, and the use of these plasmids in identifying HD-interacting proteins.
Furthermore, it will be appreciated that the GAL4 DNA-binding and activating domains are
not the only domains which can be used in the yeast two-hybrid assay. Thus, in a broader
sense, the invention encompasses any chimeric genes or plasmids containing nucleotides 314
to 1955 of the HD gene together with an activating or DNA-binding domain suitable for use
in the yeast one-, two- or three-hybrid assay for proteins critical in either binding to the HD
protein or responsible for regulated expression of the HD gene.

After introducing the plasmids into Y190 yeast host cells, transforming the host cells
with an adult human brain Matchmaker™ (Clontech) cDNA library coupled with a GAL4
activating domain, and selecting for the expression of two detectable marker genes to identify
clones containing genes for interacting proteins, the activating domain plasmids were
recovered and analyzed. As a result of this analysis, three different cDNA fragments were
identified as encoding HD-interacting proteins and designated as HIP1, HIP2 and HIP3.
The nucleic acid sequence of HIP1, as originally recovered in the yeast two-hybrid assay, is
given in Seq. ID No. 1. The polypeptide which it encodes is given by Seq. ID No. 2. Further
investigation of the HIP1 cDNA resulted in the characterization of a longer region of cDNA
totaling 4795 bases and a corresponding protein, the sequences of which are given by Seq. ID
Nos. 3 and 4, respectively. A further portion of the HIP1 protein was characterized,
extending the length to the complete protein sequence of 1090 amino acids (Seq. ID No. 5).

The cDNA molecules encoding HIP-apoptosis modulating proteins, particularly those
encoding portions of HIP1, can be explored using oligonucleotide probes, for example for
amplification and sequencing. In addition, oligonucleotide probes complementary to the
cDNA can be used as diagnostic probes to localize and quantify the presence of HIP1 DNA.
Probes of this type with a one or two base mismatch can also be used in site-directed
mutagenesis to introduce variations into the HIP1 sequence which may increase or decrease
the apoptotic activity. Preferred targets for such mutations would be the death effector
domains. Thus, a further aspect of the present invention is an oligonucleotide probe,
preferably having a length of from 15-40 bases, which specifically and selectively hybridizes
with the cDNA given by Seq. ID No. 1 or 3 or a sequence complementary thereto. As used
herein, the phrase "specifically and selectively hybridizes with" the cDNA refers to primers
which will hybridize with the cDNA under stringent hybridization conditions.
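As a rough illustration of the probe properties described above (complementarity to the cDNA, a preferred length of 15-40 bases, and one or two base mismatches for mutagenesis), the following sketch may help. The function names are hypothetical, and the sketch says nothing about whether a given oligonucleotide actually hybridizes under stringent conditions; that has to be determined experimentally.

```python
# Minimal sketch; helper names are hypothetical and not part of the specification.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_preferred_length(probe: str, low: int = 15, high: int = 40) -> bool:
    """Check the preferred probe length of 15-40 bases given in the text."""
    return low <= len(probe) <= high


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


if __name__ == "__main__":
    target = "GAAGATACCCCACCAAAC"        # 18-base stretch used only as a toy target
    probe = reverse_complement(target)   # perfectly complementary probe
    mutagenic = "A" + probe[1:]          # same probe with one altered base
    print(probe, has_preferred_length(probe), mismatches(probe, mutagenic))
```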

Probes of this type can also be used for diagnostic purposes to characterize risk of
Huntington's Disease-like symptoms arising in individuals where the symptoms are present in
the family history but are not associated with an expansion of the CAG repeat. Such
symptoms may arise from a mutation in HIP1 or other HIP-apoptosis modulating protein
which alters the interaction of this protein with huntingtin, thereby increasing the apoptotic
activity of the protein even in the presence of a normal (non-expanded) huntingtin molecule.
An appropriate probe for this purpose would be one which hybridizes with or adjacent to the
huntingtin binding region of the HIP-apoptosis modulating protein. In HIP1, this lies within
amino acids 129-514.

DNA sequencing of the HIP1 cDNA initially isolated from the yeast two-hybrid
screen (Seq. ID No. 1) revealed a 1.2 kb cDNA that shows no significant degree of nucleic
acid identity with any stretch of DNA using the blastn program at NCBI
(blast@ncbi.nlm.nih.gov). When the larger HIP1 cDNA sequence (Seq. ID No. 3) was
translated into a polypeptide, the HIP1 cDNA coding region (nucleotides 328-3069) was
observed to be devoid of stop codons, and to produce a 914 amino acid polypeptide. A
polypeptide identity search revealed an identity match over the entire length of the protein
(46% conservation) with that of a hypothetical protein from C. elegans (ZK370.3 protein;
C. elegans cosmid ZK370). This C. elegans protein shares identity with the mouse talin gene,
which encodes a 217 kDa protein implicated in maintaining integrity of the cytoskeleton.
It also shares identity with the SLA2/MOP2/END4 gene from Saccharomyces cerevisiae,
which is known to code for an essential cytoskeleton-associated protein required for the
accumulation and/or maintenance of plasma membrane H+-ATPase on the cell surface.
A pairwise comparison between HIP1 and the C. elegans ZK370.3 protein
(Genpept accession number celzk370.3) shows 26% complete identity and an overall 46%
level of conservation. Comparative analysis between HIP1 and SLA2/MOP2/END4 (EMBL
accession number Z22811) demonstrates similar conservation (20% identity, 40%
conservation).
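Two of the figures in the preceding paragraph can be checked or illustrated with a small sketch. The first function confirms that a coding region spanning nucleotides 328-3069 corresponds to 914 codons. The second shows how percent identity and percent conservation might be computed from two sequences that have already been aligned; the grouping of similar residues used here is an illustrative assumption and is not the scoring scheme used to obtain the values reported above.

```python
# Illustrative sketch; function names and similarity groups are assumptions.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]


def orf_length_aa(first_nt: int, last_nt: int) -> int:
    """Number of codons in a coding region given inclusive nucleotide positions."""
    n_nt = last_nt - first_nt + 1
    if n_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return n_nt // 3


def identity_and_conservation(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent conservation) for two aligned sequences.

    Columns containing a gap character '-' in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0, 0.0
    identical = sum(x == y for x, y in pairs)
    conserved = sum(
        x == y or any(x in group and y in group for group in SIMILAR_GROUPS)
        for x, y in pairs
    )
    return 100.0 * identical / len(pairs), 100.0 * conserved / len(pairs)


if __name__ == "__main__":
    print(orf_length_aa(328, 3069))                    # codons in nucleotides 328-3069
    print(identity_and_conservation("MKLD-E", "MRIDAE"))  # toy alignment
```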

Further exploration revealed several important facts about HIP1 that implicate it
significantly in the pathogenesis of Huntington's Disease. First, as shown in Fig. 1, it was
found that the native interaction between HD protein and HIP1 is influenced by the number
of CAG repeats. Second, it was found that expression of the HIP1 protein is enriched in the
brain. The highest amounts of expression are in the cortex, with lower levels being seen in
the cerebellum, caudate and putamen.

It has also been observed that huntingtin proteins with expanded polyglutamine tracts
can aggregate into large, irregularly shaped deposits in HD brains, transgenic mice and in
vitro cell culture. We have shown that in HEK (human embryonic kidney) 293T cells, the
aggregation of full-length and smaller huntingtin fragments occurs after the cells have been
exposed to a period of apoptotic stress. Martindale, et al., Nature Genetics 18: 150-154
(1998). In order to assess the consequence of HIP1 expression in cultured cells, we used
huntingtin aggregation as one marker of viability. We found that 14% of cells
cotransfected with huntingtin (128 CAG repeats) and HIP1 contained aggregates comparable
to those observed following application of apoptotic stress with sub-lethal doses of tamoxifen,
and that these cells were the ones into which both genes had been
introduced, as reflected by a double marker experiment. Transfection of a gene encoding a
fusion protein of 128 repeat huntingtin and the DED domain from HIP1 ligated in the sense
orientation resulted in aggregate formation in 30 to 50% of the cells.

The implications of the apoptotic activity of HIP1 are two-fold. First, the fact that
this activity is apparently differentially modulated by interaction with huntingtin having
normal and expanded repeats implicates HIP1 in the apoptotic neuronal death which is
observed in Huntington's disease and makes HIP1 a logical target for therapy. A second
implication of the apoptotic activity of HIP1 is the potential for use of HIP1 as a therapeutic
agent to induce apoptosis in cancer cells.

Therapeutic targeting of HIP1 or other HIP-apoptosis modulating proteins might take
any of several forms, but will in general be a treatment involving administration of a
composition that reduces the apoptotic activity of the HIP-apoptosis modulating protein. As
used in the specification and claims hereof, the term "administration" includes direct
administration of a composition active to reduce apoptotic activity as well as indirect
administration which might include administration of pro-drugs or nucleic acids that encode
the desired therapeutic composition.

One class of compositions which can be used in the therapeutic methods of the
invention are those compositions which interfere with the activity of HIP-apoptosis
modulating proteins by binding to the proteins and thereby masking and reducing the
interaction of HIP-apoptosis modulating proteins with the death signaling complex. Within
this class of compositions are normal (non-expanded) huntingtin, administered, for example,
via increased expression of exogenous HD genes; the HIP-binding region of huntingtin,
administered via gene therapy techniques; and other DED-interacting peptides. Other
DED-interacting peptides which might be used in a therapeutic method of this type include
FADD (Boldin et al., Cell 85: 803-815 (1996)) and caspase 8 (Muzio et al., Cell 85: 817-827
(1996)).

An alternative form of therapy involves the use of a mutant form of HIP1 or other
HIP-apoptosis modulating protein from which the DED has been deleted. DED-containing
proteins, including HIP1, are self-associating, and this self-association has been shown to be
important for activity (Muzio et al., Cell 85: 817-827 (1996)). Thus, a protein with a deleted
DED may compete with endogenous HIP-protein in the formation of these associations,
thereby reducing the amount of apoptotically-active HIP-protein.

In addition to HIP1, we have identified a further human protein, HIP1a, from a
human frontal cortex cDNA library. HIP1a is a family member of HIP1, and thus a
HIP-apoptosis modulator in accordance with the invention. A partial sequence of HIP1a (the 5'
portion of HIP1a remains to be characterized) is given by SEQ ID Nos. 6 and 7. The isolated
and characterized portion of HIP1a shows 53% nucleotide identity and 58% amino acid
conservation with HIP1 (Table 1, Figs. 2 and 3).

We have also isolated 2 mouse proteins, mHIP1 and mHIP1a (SEQ ID Nos. 8-11),
which appear to be the murine homologues of human HIP1 and HIP1a. As in the case of
human HIP1a, the 5' portion of mHIP1 remains to be isolated. At present, mHIP1 shows 85%
nucleotide identity and 90% amino acid conservation with huHIP1 (Table 1, Figs. 2 and 3).
mHIP1a shows 60% nucleotide identity and 61% amino acid conservation with huHIP1
(Table 1, Figs. 2 and 3). mHIP1a shows stronger homology to huHIP1a; it shows 87%
nucleotide identity and 91% amino acid conservation with huHIP1a (Table 1, Figs. 2 and 3).

Taken together, these findings indicate that mHIP1 is the murine homologue of huHIP1
whereas mHIP1a is most likely the murine homologue of huHIP1a. As mentioned previously,
HIP1 shows sequence similarity to Sla2p in S. cerevisiae and the hypothetical protein
ZK370.3 in C. elegans. Similarly, huHIP1a, mHIP1, and mHIP1a show sequence similarity to
Sla2p and ZK370.3 (Table 2). The carboxy-terminal regions of huHIP1a, mHIP1, and
mHIP1a all show considerable homology to the mammalian membrane
cytoskeletal-associated protein, talin. This suggests that these 3 proteins may also play a role
in the regulation of membrane events through interactions with the underlying cytoskeleton.

HIP1 contains a death effector domain (DED), a domain which is also present in a
number of proteins involved in the apoptotic pathway (Fig. 4). This suggests that HIP1 may
act as a modulator of the apoptosis pathway. The DED in huHIP1 is present between amino
acid positions 287 and 368. Similarly, HIP1a, mHIP1, and mHIP1a also contain a DED. In
huHIP1a the DED is present at amino acids 1-78 of the recovered fragment. In mHIP1 and
mHIP1a, the DED are present at amino acids 128-210 and 388-470, respectively. The DED
present in huHIP1a, mHIP1 and mHIP1a all show significant percentage amino acid
conservation to the DED present in huHIP1 (Table 3).

Increasing expression of normal (non-expanded) huntingtin or the HIP-apoptotic
modulator-binding portion thereof, of a modified HIP-apoptotic modulator in which the DED
has been deleted, or of a DED-interacting protein or peptide can be accomplished using gene
therapy approaches. In general, this will involve introduction of DNA encoding the
appropriate protein or peptide in an expressible vector into the brain cells. Expression of
HIP-apoptosis modulating proteins may also be useful in treatment of cancer, in which case
application to other cell types would be desired, and cells expressing HIP-apoptosis
modulating proteins may be used for screening of therapeutic compounds. Thus, in a more
general sense, expression vectors are defined herein as DNA sequences that are required for
the transcription of cloned copies of genes and the translation of their mRNAs in an
appropriate cell type. Specifically designed vectors allow the shuttling of DNA between
hosts such as bacteria-yeast or bacteria-animal cells. An appropriately constructed expression
vector may contain: an origin of replication for autonomous replication in host cells,
selectable markers, a limited number of useful restriction enzyme sites, a potential for high
copy number, and active promoters. A promoter is defined as a DNA sequence that directs
RNA polymerase to bind to DNA and initiate RNA synthesis. A strong promoter is one
which causes mRNAs to be initiated at high frequency. Expression vectors may include, but
are not limited to, cloning vectors, modified cloning vectors, specifically designed plasmids
or viruses.

A variety of mammalian expression vectors may be used to express recombinant
HIP-apoptosis modulating proteins or fragments thereof in mammalian cells. Commercially
available mammalian expression vectors which may be suitable for recombinant
HIP-apoptosis modulating protein expression include, but are not limited to, pMC1neo
(Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo (ATCC 37593),
pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC
37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC 37146), pUCTag (ATCC 37460), and
λZD35 (ATCC 37565). Other vectors which have been shown to be suitable expression
systems in mammalian cells include the herpes simplex viral based vectors: pHSV1 (Geller
et al. Proc. Natl. Acad. Sci. 87:8950-8954 (1990)); recombinant retroviral vectors: MFG
(Jaffee et al. Cancer Res. 53:2221-2226 (1993)); Moloney-based retroviral vectors: LN,
LNSX, LNCX, LXSN (Miller and Rosman Biotechniques 7:980-989 (1989)); vaccinia viral
vector: MVA (Sutter and Moss Proc. Natl. Acad. Sci. 89:10847-10851 (1992)); recombinant
adenovirus vectors: pJM17 (Ali et al. Gene Therapy 1:367-384 (1994)), (Berkner K. L.
Biotechniques 6:616-624 (1988)); second generation adenovirus vector: DE1/DE4 adenoviral
vectors (Wang and Finer Nature Medicine 2:714-716 (1996)); and adeno-associated viral
vectors: AAV/Neo (Muro-Cacho et al. J. Immunotherapy 11:231-237 (1992)).

The expression vector may be introduced into host cells via any one of a number of
techniques including but not limited to transformation, transfection, infection, protoplast
fusion, and electroporation. The expression vector-containing cells are clonally propagated
and individually analyzed to determine whether they produce the desired protein. Delivery of
retroviral vectors to brain and nervous system tissue has been described in US Patents Nos.
4,866,042, 5,082,670 and 5,529,774, which are incorporated herein by reference. These
patents disclose the use of cerebral grafts or implants as one mechanism for introducing
vectors bearing therapeutic gene sequences into the brain, as well as an approach in which the
vectors are transmitted across the blood brain barrier.

To further illustrate the methods of making the materials which are the subject of this
invention, and the testing which has established their utility, the following non-limiting
experimental procedures are provided.

EXAMPLE 1

IDENTIFICATION OF INTERACTING PROTEINS
GAL4-HD cDNA constructs

An HD cDNA construct (44pGBT9) with 44 CAG repeats was generated,
encompassing amino acids 1-540 of the published HD cDNA. This cDNA fragment was
fused in frame to the GAL4 DNA-binding domain (BD) of the yeast two-hybrid vector
pGBT9 (Clontech). Other HD cDNA constructs, 16pGBT9, 80pGBT9 and 128pGBT9, were
constructed identically to 44pGBT9 except that they included 16, 80 or 128 CAG repeats,
respectively.

Another clone (DMKDBamHIpGBT9) containing the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) was fused in-frame with the GAL4-DNA
BD of pGBT9 and was used as a negative control. Plasmids expressing the GAL4-BD-RAD7
(D. Gietz, unpublished) and SIR3 were used as a positive control for the β-galactosidase filter
assay.

The clones IT15-23Q, IT15-44Q and HAP1 were generous gifts from Dr. C. Ross.
These clones represent a previously isolated huntingtin interacting protein that has a higher
affinity for the expanded form of the HD protein.

Yeast strains, transformations and β-galactosidase assays
The yeast strain Y190 (MATa leu2-3,112, ura3-52, trp1-901, his3-Δ200, ade2-101,
gal4Δ gal80Δ, URA3::GAL-lacZ, LYS2::GAL-HIS3, cyhr) was used for all transformations
and assays. Yeast transformations were performed using a modified lithium acetate
transformation protocol and grown at 30°C using appropriate synthetic complete (SC) dropout
media.

The β-galactosidase chromogenic filter assays were performed by transferring the
yeast colonies onto Whatman filters. The yeast cells were lysed by submerging the filters in
liquid nitrogen for 15-20 seconds. Filters were allowed to dry at room temperature for at
least five minutes and placed onto filter paper presoaked in Z-buffer (100 mM sodium
phosphate (pH 7.0), 10 mM KCl, 1 mM MgSO4) supplemented with 50 mM
2-mercaptoethanol and 0.07 mg/ml 5-bromo-4-chloro-3-indolyl β-D-galactoside (X-gal).
Filters were placed at 37°C for up to 8 hours.

Yeast two-hybrid screening for huntingtin interacting protein (HIP) 
cDNAs from an adult human brain Matchmaker™ cDNA library (Clontech) were
transformed into the yeast strain Y190 already harboring the 44pGBT9 construct. The
transformants were plated onto one hundred 150 mm x 15 mm circular culture dishes
containing SC media deficient in Trp, Leu and His. The herbicide 3-amino-triazole (3-AT)
(25 mM) was utilized to limit the number of false His+ positives (31). The yeast
transformants were incubated at 30°C for 5 days and β-galactosidase filter assays were performed
on all colonies found after this time, as described above, to identify β-galactosidase+ clones.
Primary His+/β-galactosidase+ clones were then patched in order onto a grid on SC
-Trp/-Leu/-His (25 mM 3-AT) plates and assayed again for His+ growth and the ability to turn
blue with a filter assay. Secondary positives were identified for further analysis. Proteins
encoded by positive cDNAs were designated as HIPs (Huntingtin Interactive Proteins).
Approximately 4.0 x 10^ Trp/Leu auxotrophic transformants were screened and, of 14 clones
isolated, 12 represented the same cDNA (HIP1), and the other 2 cDNAs, HIP2 and HIP3, were
each represented only once.

The HIP cDNA plasmids were isolated by growing the His+/β-galactosidase+ colony
in SC -Leu media overnight, lysing the cells with acid-washed glass beads and
electroporating the bacterial strain KC8 (leuB auxotrophic) with the yeast lysate. The KC8
ampicillin resistant colonies were replica plated onto M9 (-Leu) plates. The plasmid DNA
from M9+ colonies was transformed into DH5-α for further manipulation.

EXAMPLE 2

CONFIRMATION OF INTERACTIONS 
The HIP1-GAL4-AD cDNA activated both the lacZ and His reporter genes in the
yeast strain Y190 only when co-transformed with the GAL4-BD-HD construct, but not with the
negative controls (Fig. 1) of the vector alone or a random fusion protein of the myotonin
kinase gene. In order to assess the influence of the polyglutamine tract on the interaction
between HIP1 and HD, semi-quantitative β-galactosidase assays were performed.
GAL4-BD-HD fusion proteins with 16, 44, 80 and 128 glutamine repeats were assayed for
their strength of interaction with the GAL4-AD-HIP1 fusion protein.

Liquid β-galactosidase assays were performed by inoculating a single yeast colony
into appropriate synthetic complete (SC) dropout media and growing it to OD600 0.6-1.5. Five
millilitres of overnight culture was pelleted and washed once with 1 ml of Z-Buffer, then
resuspended in 100 µl Z-Buffer supplemented with 38 mM 2-mercaptoethanol and 0.05%
SDS. Acid washed glass beads (~100 µl) were added to each sample and vortexed for four
minutes, by repeatedly alternating a 30 second vortex with 30 seconds on ice. Each sample
was pelleted and 10 µl of lysate was added to 500 µl of lysis buffer. The samples were
incubated in a 30°C waterbath for 30 seconds and then 100 µl of a 4 mg/ml o-nitrophenyl
β-D-galactopyranoside (ONPG) solution was added to each tube. The reaction was allowed
to continue for 20 minutes at 30°C and stopped by the addition of 500 µl of 1 M Na2CO3 and
placing the samples on ice. Subsequently, OD420 was taken in order to calculate the
β-galactosidase activity with the equation 1000 x OD420/(t x V x OD600), where t is the
elapsed time (minutes) and V is the amount of lysate used.
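The activity equation given above can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, and the unit of V (taken here to be millilitres) is an assumption, since the text says only that V is the amount of lysate used. The input values in the usage example are made up for illustration and are not measurements from this work.

```python
def beta_gal_activity(od420: float, od600: float, minutes: float, lysate_volume: float) -> float:
    """Compute 1000 x OD420 / (t x V x OD600), as stated in the text.

    minutes        elapsed reaction time t, in minutes
    lysate_volume  amount of lysate used, V (unit assumed to be ml)
    """
    if minutes <= 0 or lysate_volume <= 0 or od600 <= 0:
        raise ValueError("time, lysate volume and OD600 must be positive")
    return 1000.0 * od420 / (minutes * lysate_volume * od600)


if __name__ == "__main__":
    # Hypothetical readings: 20 minute reaction, 0.01 ml of lysate.
    print(beta_gal_activity(od420=0.5, od600=1.0, minutes=20, lysate_volume=0.01))
```

Because the formula divides by OD600, cultures grown to different densities can be compared on a per-cell-density basis.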

The specificity of the HIP1-HD interaction can be observed using the chromogenic
filter assay. Only yeast cells harboring HIP1 and HD activate both the HIS and lacZ reporter
genes in the Y190 yeast host. The cells that contain the HIP1 with HD constructs with 80 or
128 CAG repeats turn blue approximately 45 minutes after the cells with the smaller sized
repeats (16 or 44).

No difference in the β-galactosidase activity was observed between the 16 and 44
repeats or between the 80 and 128 repeats. However, a significant difference (p<0.05) in
activity is seen between the smaller repeats (16 and 44) and the larger repeats (80 and 128)
(Figure 1).



EXAMPLE 3 

DNA SEQUENCING, cDNA ISOLATION AND 5' RACE 
Oligonucleotide primers were synthesized on an ABI PCR-mate oligo-synthesizer.
DNA sequencing was performed using an ABI 373 fluorescent automated DNA sequencer.
The HIP cDNAs were confirmed to be in-frame with the GAL4-AD by sequencing across the
AD-HIP1 cloning junction using an AD oligonucleotide (5' GAA GAT ACC CCA CCA
AAC 3') (Seq. ID No. 12).

Subsequently, primer walking was used to determine the remaining sequences. A
human frontal cortex >4.0 kb cDNA library (a gift from S. Montal) was screened to isolate
the full length HIP1 gene. Fifty nanograms of a 558 base pair Eco RI fragment from the
original HIP1 cDNA was radioactively labeled with [α-32P]-dCTP using nick-translation and
the probe allowed to hybridize to filters containing >105 pfu/ml of the cDNA library
overnight at 65°C in Church buffer (see Northern blot protocol). The filters were washed at
65°C for 10 minutes with 1 X SSPE, 15 minutes at 65°C with 1 X SSPE and 0.1% SDS, then
for thirty minutes and fifteen minutes with 1 X SSPE and 0.1% SDS. The filters were
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated
and replated and subsequent secondary positives were hybridized and washed as for the
primary screen. The resulting positive phage were converted into plasmid DNA by
conventional methods (Stratagene) and the cDNA isolated and sequenced.

In order to obtain the most 5' sequence of the HIP1 gene, a Rapid Amplification of 
cDNA Ends (RACE) protocol was performed according to the manufacturer's 
recommendations (BRL). First strand cDNA was synthesized using the oligo HIP1-242R (5' 
GCT TGA CAG TGT AGT CAT AAA GGT GGC TGC AGT CC 3') (Seq. ID No. 13). 
After dCTP tailing of the cDNA with terminal deoxynucleotidyl transferase, two rounds of 
35 cycles (94°C 1 minute; 53°C 1 minute; 72°C 2 minutes) of PCR using HIP1-R2 (5' GGA CAT 
GTC CAG GGA GTT GAA TAC 3') (Seq. ID No. 14) and an anchor primer (5' (CUA)4 
GGC CAC GCG TCG ACT AGT ACG GGI IGG GII GGG IIG 3') (BRL, Seq. ID No. 15) 
were performed. The resulting 650 base pair PCR product was cloned using the TA 
cloning system (Invitrogen) and sequenced using T3 and T7 primers. Sequence ID Nos. 1 
and 3 show the sequences of the HIP1 cDNAs obtained. 
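
As an aid to checking primers such as those listed above, the following Python sketch computes the length, GC fraction and reverse complement of an oligonucleotide. The helper names are hypothetical and are not part of the described method; only the HIP1-R2 sequence is taken from the text.

```python
# Complement table for unambiguous DNA bases only (inosine and uracil are not handled).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def reverse_complement(primer: str) -> str:
    """Return the reverse complement, i.e. the strand the primer anneals to, read 5' to 3'."""
    return clean(primer).translate(COMPLEMENT)[::-1]


def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


HIP1_R2 = "GGA CAT GTC CAG GGA GTT GAA TAC"  # Seq. ID No. 14, as written in the text

primer_length = len(clean(HIP1_R2))
primer_gc = gc_fraction(HIP1_R2)
primer_target = reverse_complement(HIP1_R2)
```

The anchor primer contains inosine (I) and uracil (U) residues, so it would need an extended complement table before it could be passed to these helpers.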

EXAMPLE 4 
DNA AND AMINO ACID ANALYSES 
Overlapping DNA sequence was assembled using the program MacVector and sent 
via email or Netscape to the BLAST server at NIH (http://www.ncbi.nlm.nih.gov) to search 
for sequence similarities with known DNA (blastn) or protein (tblastn) sequences. Amino 
acid alignments were performed with the program Clustalw. 

EXAMPLE 5 

FISH DETECTION SYSTEM AND IMAGE ANALYSIS 
The HIP1 cDNA isolated from the two-hybrid screen was mapped by fluorescent in 
situ hybridization (FISH) to normal human lymphocyte chromosomes counterstained with 
propidium iodide and DAPI. Biotinylated probe was detected with avidin-fluorescein 
isothiocyanate (FITC). Images of metaphase preparations were captured by a 
thermoelectrically cooled charge coupled camera (Photometrics). Separate images of DAPI 
banded chromosomes and FITC targeted chromosomes were obtained. Hybridization signals 
were acquired and merged using image analysis software, pseudo-colored blue (DAPI) 
and yellow (FITC) as described, and overlaid electronically. This study showed that HIP1 
maps to a single genomic locus at 7q11.2. 

EXAMPLE 6 

NORTHERN BLOT ANALYSIS 
RNA was isolated using the single step method of homogenization in guanidinium 
isothiocyanate and fractionated on a 1.0% agarose gel containing 0.6 M formaldehyde. The 
RNA was transferred to a Hybond-N membrane (Amersham) and crosslinked with ultraviolet 
radiation. 

Hybridization of the Northern blot with β-actin as an internal control probe 
provided confirmation that the RNA was intact and had transferred. The 1.2 kb HIP1 cDNA 
was labeled using nick translation and incorporation of α-32P-dCTP. Hybridization of the 
original 1.2 kb HIP1 cDNA was carried out in Church buffer (0.5 M sodium phosphate 
buffer, pH 7.2, 2.7% sodium dodecyl sulphate, 1 mM EDTA) at 55°C overnight. Following 
hybridization, Northern blots were washed once for 10 minutes in 2.0X SSPE, 0.1% SDS at 
room temperature and twice for 10 minutes in 0.15X SSPE, 0.1% SDS. Autoradiography 
was carried out for one to three days using Hyperfilm (Amersham) film at -70°C. 

Analysis of HIP1 RNA levels by Northern blot revealed that the 
10 kilobase HIP1 message is present in all tissues assessed. However, the levels of RNA are 
not uniform, with brain having the highest levels of expression and peripheral tissues having 
less message. No apparent differences in RNA expression were noted between control samples 
and HD affected individuals. 



EXAMPLE 7 

TISSUE LOCALIZATION OF HIP1 
Tissue localization of HIP1 was studied using a variety of techniques as described 
below. 

Subcellular distribution of HIP-1 protein in adult human and mouse brain 

Biochemical fractionation studies revealed HIP1 to be a membrane-associated 
protein. No immunoreactivity was seen by Western blotting in cytosolic fractions using the 
anti-HIP1-pep1 polyclonal antibody. HIP1 immunoreactivity was observed in all membrane 
fractions, including nuclei (P1), mitochondria and synaptosomes (P2), and microsomes and 
plasma membranes (P3). The P3 fraction contained the most HIP1 compared to the other 
membrane fractions. HIP1 could be removed from membranes by high salt (0.5M NaCl) 
buffers, indicating that it is not an integral membrane protein; however, since low salt 
(0.1-0.25M NaCl) was only able to partially remove HIP1 from membranes, its membrane 
association is relatively strong. Extraction of P3 membranes with the non-ionic detergent 
Triton X-100 revealed HIP1 to be a Triton X-100 insoluble protein. This characteristic is shared by 
many cytoskeletal and cytoskeletal-associated membrane proteins, including actin, which was 
used as a control in this study. The biochemical characteristics of HIP1 described here were 
found to be identical in mouse and human brain and were the same for both forms of the protein 
(both bands of the HIP1 doublet). HIP1 co-localized with huntingtin in the P2 and P3 
membrane fractions, including the high-salt membrane extractions, as well as in the Triton 
X-100 insoluble residue. The subcellular distribution of HIP1 was unaffected by the 
expression of polyglutamine-expanded huntingtin in transgenic mice and HD patient brain 
samples. 

The localization of HIP1 protein was further investigated by immunohistochemistry in 
normal adult mouse brain tissue. Immunoreactivity was seen in a patchy, reticular pattern in 
the cytoplasm, appeared excluded from the nucleus, and stained most intensely in a 
discontinuous pattern at the membrane. These results are consistent with the association of 
HIP1 with the cytoskeletal matrix and further indicate an enrichment of HIP1 at plasma 
membranes. Immunoreactivity occurred in all regions of the brain, including cortex, 
striatum, cerebellum and brainstem, but appeared most strongly in neurons and especially in 
cortical neurons. As described previously, huntingtin immunoreactivity was seen exclusively 
and uniformly in the cytosol. 

The in situ hybridization studies showed HIP1 mRNA to be ubiquitously and 
generally expressed throughout the brain. These data are consistent with the 
immunohistochemical results and were identical to the distribution pattern of huntingtin mRNA in 
transgenic mouse brains expressing full-length human huntingtin. 

Protein Preparation And Western Blotting For Expression Studies 

Frozen human tissues were homogenized using a Polytron in a buffer containing 
0.25M sucrose, 20mM Tris-HCl (pH 7.5), 10mM EGTA, 2mM EDTA supplemented with 
10 µg/ml of leupeptin, soybean trypsin inhibitor and 1mM PMSF, then centrifuged at 
4,000 rpm for 10 minutes at 4°C to remove cellular debris. 100-150 µg/lane of protein was 
separated on 8% SDS-PAGE mini-gels and then transferred to PVDF membranes. Huntingtin and 
HIP1 were electroblotted overnight in Towbin's transfer buffer (25 mM Tris-HCl, 0.192M glycine, 
pH 8.3, 10% methanol) at 30V onto PVDF membranes (Immobilon-P, Millipore) as described 
(Towbin et al., Proc. Nat'l Acad. Sci. (USA) 76: 4350-4354 (1979)). Membranes were blocked 
for 1 hour at room temperature in 5% skim milk/TBS (10mM Tris-HCl, 0.15M NaCl, 
pH 7.5). Antibodies against huntingtin (pAb BKP1, 1:500), actin (mAb A-4700, Sigma, 
1:500) or HIP1 (pAb HIP-pep1, 1:200) were added to blocking solution for 1 hour at room 
temperature. After 3 x 10 minute washes in TBS-T (0.05% Tween-20/TBS), secondary Ab 
(horseradish peroxidase conjugated IgG, Biorad) was applied in blocking solution for 1 hour 
at room temperature. Membranes were washed and then incubated in chemiluminescent ECL 
solution and visualized using Hyperfilm-ECL film (Amersham). 

Generation of Antibodies 
The generation of huntingtin-specific antibodies GHM1 and BKP1 is described 
elsewhere (Kalchman, et al., J. Biol. Chem. 271: 19385-19394 (1996)). The HIP1 peptide 
(VLEKDDLMDMDASQQN, a.a. 76-91 of Seq. ID No. 2) was synthesized with Cys on the 
N-terminus for coupling, and coupled to keyhole limpet hemocyanin (KLH) (Pierce) 
with succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (Pierce). Female 
New Zealand White rabbits were injected with HIP1 peptide-KLH and Freund's adjuvant. 
Antibodies against the HIP1 peptide were purified from rabbit sera using an affinity column 
with low pH elution. The affinity column was made by incubation of HIP1 peptide with 
activated thio-Sepharose (Pharmacia). 

Western blotting of various peripheral and brain tissues was consistent with the RNA 
data. The HIP1 protein levels observed were not equivalent in all tissues. Protein 
expression is predominant in brain tissue, with the highest amounts seen in the cortex and 
lower levels seen in the cerebellum, caudate and putamen. 

More region-specific analysis of HIP1 expression in the brain revealed no differential 
expression pattern in affected individuals when compared to normal controls, with the highest 
levels of expression seen in the cortical regions in both controls and HD patients. 

EXAMPLE 8 

CO-IMMUNOPRECIPITATION OF HIP1 WITH HUNTINGTIN 
Confirmation of the HD-HIP1 interaction was performed using coimmunoprecipitation 
as follows. Control human brain (frontal cortex) lysate was prepared in the same manner as 
for the subcellular localization study. Prior to immunoprecipitation, tissue lysate was 
centrifuged at 5000 rpm for 2 minutes at 4°C, then the supernatant was pre-cleared by 
incubation with an excess amount of Protein A-Sepharose for 30 minutes at 4°C, and 
centrifuged under the same conditions. Fifty microlitres of supernatant (500 µg protein) was 
incubated with or without antibodies (10 µg of anti-huntingtin GHM1 (Kalchman, et al. 1996) 
or anti-synaptobrevin antibody) in a total of 500 µl of incubation buffer (20mM Tris-Cl 
(pH 7.5), 40mM NaCl, 1mM MgCl2) for 1 hour at 4°C. Twenty microlitres of Protein 
A-Sepharose (1:1 suspension, for GHM1 and no antibody control) or Protein G-Sepharose 
(for anti-synaptobrevin antibody; Pharmacia) was added and incubated for 1 hour at 4°C. 
The beads were washed with washing buffer (incubation buffer containing 0.5% Triton 
X-100) three times. The samples on the beads were separated using SDS-PAGE (7.5% 
acrylamide) and transferred to PVDF membrane (Immobilon-P, Millipore). The membrane 
was cut at about 150 kDa after transfer for Western blotting (as described above). The upper 
piece was probed with anti-huntingtin BKP1 (1/1000) and the lower piece with anti-HIP1 
antibody (1/300). 

The results showed that when an anti-HIP1 polyclonal antibody was immunoreacted 
against a blot containing the GHM1 immunoprecipitates from the brain lysate, a doublet was 
observed at approximately 100 kDa. When GHM1 was immunoreacted against the same 
immunoprecipitate, the 350 kDa HD protein was also seen. The specificity of the HD-HIP1 
interaction is shown by the absence of immunoreactive bands resulting from proteins adsorbing 
to the Protein A-Sepharose (Lysate + No Antibody) or when a random, unrelated antibody 
(Lysate + anti-Synaptobrevin) is used as the immunoprecipitating antibody. 



EXAMPLE 9 

Subcellular fractionation of brain tissue 
Cortical tissue (20-100 mg/ml) was homogenized, on ice, in a 2 ml pyrex-teflon 
IKA-RW15 homogenizer (Tekmar Company) in a buffer containing 0.303M sucrose, 20mM 
Tris-HCl pH 6.9, 1mM MgCl2, 0.5mM EDTA, 1mM PMSF, 1mM leupeptin, soybean 
trypsin inhibitor and 1mM benzamidine (Wood et al., Human Molec. Genet. 5: 481-487 
(1996)). 

Crude membrane vesicles were isolated by two cycles of a three-step differential 
centrifugation protocol in a Beckman TLA 120.2 rotor at 4°C based on the methods of Wood 
et al. (1996). The first step precipitated cellular debris and nuclei from tissue homogenates for 
5 minutes at 1300 x g (P1). The 1300 x g supernatant was subsequently centrifuged for 20 
minutes at 14 000 x g to isolate synaptosomes and mitochondria (P2). Finally, microsomal 
and plasma membrane vesicles were collected by a 35 minute centrifugation at 142 000 x g 
(P3). The remaining supernatant was defined as the cytosolic fraction. 
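
The three-step centrifugation scheme can be summarized as a small data structure, as in the Python sketch below. The class and function names are hypothetical and serve only to restate the protocol; the speeds, times and fraction definitions are those given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spin:
    label: str     # name of the pellet fraction
    g_force: int   # relative centrifugal force (x g)
    minutes: int   # duration of the spin
    pellet: str    # material collected in the pellet


# One cycle of the differential centrifugation, in the order the spins are run.
SPINS = [
    Spin("P1", 1300, 5, "cellular debris and nuclei"),
    Spin("P2", 14000, 20, "synaptosomes and mitochondria"),
    Spin("P3", 142000, 35, "microsomal and plasma membrane vesicles"),
]


def describe(spins):
    """Return one line per spin, plus the definition of the final supernatant."""
    lines = [f"{s.label}: {s.minutes} min at {s.g_force} x g -> {s.pellet}" for s in spins]
    lines.append("final supernatant -> cytosolic fraction")
    return lines


def spin_minutes_per_cycle(spins):
    """Total centrifugation time for one cycle, excluding handling time."""
    return sum(s.minutes for s in spins)
```

Each spin acts on the supernatant of the previous one, which is why the list is ordered by increasing g-force.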

High salt extraction of membranes 
Aliquots of P3 membranes were twice suspended at 2 mg/ml in 0.5M NaCl, 10mM 
Tris-HCl, 2mM MgCl2, pH 7.2, containing protease inhibitors (see above). The same buffer 
without NaCl was used as a control. The membrane suspensions were incubated on ice for 30 
minutes and then centrifuged at 142 000 x g for 30 minutes. 

Extraction of cytoskeletal and cytoskeletal-associated proteins 

To extract cytoskeletal proteins, crude membrane vesicles from the P3 membrane 
fraction were suspended in a volume of Triton X-100 extraction buffer to give a 
protein:detergent ratio of 5:1. The composition of the Triton X-100 extraction buffer was based on 
the methods of Arai et al., J. Neuroscience 38: 348-357 (1994) and contained 2% Triton 
X-100, 10mM Tris-HCl, 2mM MgCl2, 1mM leupeptin, soybean trypsin inhibitor, PMSF and 
benzamidine. Membrane pellets were suspended by hand with a round-bottom teflon pestle 
and placed on ice for 40 minutes. Insoluble cytoskeletal matrices were precipitated for 35 
minutes at 142 000 x g in a Beckman TLA 120.2 rotor. The supernatant was defined as 
non-cytoskeletal-associated membrane or membrane-associated protein and was removed. 
The remaining pellet was extracted with Triton X-100 a second time using the same 
conditions. We defined the final pellet as cytoskeletal and cytoskeletal-associated protein. 

Solubilization of protein and analysis by SDS-PAGE and Western Blotting 

Membrane and cytoskeletal protein was solubilized in a minimum volume of 1% 
SDS, 3M urea, 0.1 mM dithiothreitol in TBS buffer and sonicated. Protein concentration was 
determined using the BioRad DC Protein assay, and samples were diluted at least 1X with 5X 
sample buffer (250mM Tris-HCl pH 6.8, 10% SDS, 25% glycerol, 0.02% bromophenol 
blue and 7% 2-mercaptoethanol) and loaded on 7.5% SDS-PAGE gels (Bio-Rad 
Mini-PROTEAN II Cell system) without boiling. Western blotting was performed as 
described above. 

Immunohistochemistry 

Brain tissue was obtained from a normal C57BL/6 adult (6 months old) male mouse 
sacrificed with chloroform then perfusion-fixed with 4% v/v paraformaldehyde/0.01 M 
phosphate buffer (4% PFA). The brain tissues were removed, immersion-fixed in 4% PFA 
for 1 day, washed in 0.01M phosphate buffered saline, pH 7.2 (PBS) for 2 days, and then 
equilibrated in 25% w/v sucrose PBS for 1 week. The samples were then snap-frozen in 
Tissue Tek molds by isopentane cooled in liquid nitrogen. After warming to -20°C, frozen 
blocks derived from frontal cortex, caudate/putamen, cerebellum and brainstem were cut into 
14 µm sections for immunohistochemistry. Following washing in PBS, the tissue sections 
were blocked using 2.5% v/v normal goat serum for 1 hour at room temperature. Primary 
antibodies diluted with PBS were applied to sections overnight at 4°C. Optimal dilutions for 
the polyclonal antibodies BKP1 and HIP1 were 1:50. Using washes of 3 x 5 minutes in PBS 
at room temperature, sections were sequentially incubated with biotinylated secondary 
antibody and then an avidin-biotin complex reagent (Vectastain ABC Kit, Vector) for 60 
minutes each at room temperature. Color was developed using 3,3'-diaminobenzidine 
tetrahydrochloride and ammonium nickel sulfate. 

For controls, sections were treated as described above except that HIP1 antibody 
aliquots were preabsorbed with an excess of HIP1 peptide, as well as with a peptide unrelated 
to HIP1, prior to incubation with the tissue sections. 

In situ hybridization 

In situ hybridization was performed as previously described with some modification 
(Suzuki et al., BBRC 219: 708-713 (1996)). The RNA probes were prepared using the 
plasmid gtl49 (Lin, B., et al., Human Molec. Genet. 2: 1541-1545 (1994)) or a 558 bp subclone 
of HIP1. The anti-sense and sense single-stranded RNA probes were synthesized using T3 
and T7 RNA polymerases and the In Vitro Transcription Kit (Clontech) with the addition of 
[α-35S]-CTP (Amersham) to the reaction mixture. Sense RNA probes were used as negative 
controls. For HIP1 studies normal C57BL/6 mice were used. Huntingtin probes were tested 
on two different transgenic mouse strains expressing full-length huntingtin, cDNA HD10366 
(44CAG) C57BL/6 mice and YAC HD10366 (18CAG) FVB/N mice. Frozen brain sections 
(10 µm thick) were placed onto silane-coated slides under RNase-free conditions. The 
hybridization solution contained 40% w/v formamide, 0.02M Tris-HCl (pH 8.0), 0.005M 
EDTA, 0.3M NaCl, 0.01M sodium phosphate (pH 7.0), 1x Denhardt's solution, 10% w/v 
dextran sulfate (pH 7.0), 0.2% w/v sarcosyl, yeast tRNA (500 µg/ml) and salmon sperm DNA 
(200 µg/ml). The radiolabelled RNA probe was added to the hybridization solution to give 
1 x 10^6 cpm/100 µl/section. Sections were covered with hybridization solution and incubated 
on formamide paper at 65°C for 18 hours. After hybridization, the slides were washed for 30 
minutes sequentially with 2x SSC, 1x SSC and high stringency wash solution (50% 
formamide, 2x SSC and 0.1 M dithiothreitol) at 65°C, followed by treatment with RNase A 
(1 mg/ml) at 37°C for 30 minutes, then washed again and air-dried. The slides were first 
exposed on autoradiographic film (β-max, Amersham, UK) for 48 hours and developed for 4 
minutes in Kodak D-19 followed by a 5 minute fixation in Fuji-fix. For longer exposures, the 
slides were dipped in autoradiographic emulsion (50% w/v in distilled water, NR-2, Konica, 
Japan), air-dried and exposed for 20 days at 4°C, then developed as described. Sections were 
counterstained with methyl green or Giemsa solutions. 

EXAMPLE 10 

We determined a more precise location of the HIP1 gene on chromosome 7 in the 
context of a physical and genetic map of chromosome 7, and determined its genomic 
organization. HIP1 maps by FISH and RH mapping to chromosome band 7q11.23, which 
contains the chromosomal region commonly deleted in Williams-Beuren syndrome (WS). 
We used several methods to refine the mapping of HIP1 in this region. PCR screening of a 
chromosome 7 YAC library (Scherer et al., Mammalian Genome 3: 179-181 (1992)) with 
primers from the 3' UTR of HIP1 resulted in the identification of only a single positive YAC 
clone (HSC7E512). This YAC clone had previously been shown to map near the Williams 
syndrome commonly deleted region (Osborne et al., Genomics 45: 402-406 (1997)). The 
HIP1 cDNA was then used to screen a chromosome 7 specific cosmid library from the 
Lawrence Livermore National Laboratory (LL07NC01), and the RPCI genomic P1 derived 
artificial chromosome (PAC) library (Pieter de Jong, Roswell Park, Buffalo, NY). Several 
PAC and cosmid clones that were already part of pre-assembled contigs in the Williams 
syndrome region at 7q11.23 were identified (Fig. 5). Restriction enzyme digestion, blot 
hybridization experiments and PCR screening confirmed that the clones contained the HIP1 
gene. 

We determined the exon-intron boundaries and intron sizes of HIP1. Primers were 
designed based on the sequence of the HIP1 transcript and used to sequence directly from the 
cosmid, PAC clone and long PCR products from PAC or genomic DNA. Whenever a PCR 
fragment generated was longer than predicted from the cDNA sequence, it was assumed to 
contain an intron. The size of the introns was determined by sequencing the intron directly 
or by PCR amplification of the introns from both genomic DNA and the cosmid or PAC 
clone from the region. Three sets of overlapping cosmids and a PAC clone that contain the 
entire coding sequence of HIP1 were characterized (Fig. 5). Cosmids 181G10 and 250F2 
were digested with EcoRI and cloned into the plasmid Bluescript. Further sequences were 
generated from these plasmid subclones. Intron-exon boundary sequences were then 
identified by comparing HIP1 genomic and transcript sequences. The gene is contained within 
75 kb and comprises 29 exons and 28 introns. The intron-exon boundary sequences are 
shown in Table 4, along with the exon and intron sizes. A graphic summary of these data is 
also shown in Fig. 5. Exons 1 to 28 contained the coding regions. The last and largest 
exon of the HIP1 gene was found to contain approximately 7 kb. Most of the intron-exon 
junctions followed the canonical GT-AG rule. An AT was found at the 3' splice site of exon 
1 and an AG at the 5' splice site of exon 2. Sequence data from all the exon-intron borders 
of the coding region and 3'-UTR are set forth in Seq. ID Nos. 16-44. (These sequences have 
been deposited with GenBank as Accession Nos. AF052261 to AF052288.) 
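
Two of the inferences described above, that a PCR product longer than the cDNA prediction contains an intron and that most junctions obey the GT-AG rule, can be illustrated with the following Python sketch. The function names and the example sequences and sizes are hypothetical; they are not taken from Table 4.

```python
def estimated_intron_size(genomic_product_bp: int, cdna_predicted_bp: int) -> int:
    """Estimate the intron content of a genomic PCR product.

    The product spans two primers placed in the transcript, so any extra
    length relative to the cDNA prediction is attributed to intron sequence.
    A product no longer than predicted is taken to contain no intron.
    """
    return max(genomic_product_bp - cdna_predicted_bp, 0)


def follows_gt_ag(intron: str) -> bool:
    """Return True if the intron begins with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical example: a 1450 bp genomic product where 250 bp was predicted.
example_intron_bp = estimated_intron_size(1450, 250)
canonical = follows_gt_ag("gtaagtcctttttcag")      # made-up canonical intron
non_canonical = follows_gt_ag("ataagtcctttttcag")  # made-up exception
```

A product spanning more than one intron would give only the combined intron length, which is one reason direct sequencing of the boundaries is also described.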

Sequence analysis of the previously published 5' untranslated region (GenBank accession 
U79734) revealed the possibility that the open reading frame extends upstream of the ATG in 
exon 4 to a 5' ATG in exon 1. Although we failed to obtain any additional 5' sequences 
despite repeated 5' RACE analyses, an additional ATG, 284 bp upstream of the previously 
published exon 1, is in the same reading frame and has the surrounding sequence 
TGGGATGTT, which is similar to AGGGATGGG, the consensus Kozak sequence 
(Kozak, M. Nucl. Acids Res. 15: 8125-8148 (1987)). If translated from this ATG, the protein 
would be highly homologous to the N-terminal portion of ZK370.3 and yeast Sla2 protein 
(Fig. 6). The translated protein in the region of exons 1 to 3 shows an identity of >40% and 
similarity of >60% to the N-terminal part of ZK370.3. This suggests that exons 1 to 3 
are probably translated. 
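
The similarity between the sequence surrounding the upstream ATG and the consensus quoted above can be made explicit with a position-by-position comparison. The Python sketch below is illustrative only; the function name is hypothetical, and the two nine-base strings are those given in the text.

```python
def matching_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper()))


HIP1_CONTEXT = "TGGGATGTT"  # sequence surrounding the upstream ATG
CONSENSUS = "AGGGATGGG"     # consensus as quoted in the text

shared = matching_positions(HIP1_CONTEXT, CONSENSUS)
```

On these two strings, `shared` is 6 of 9 positions, three of which are the ATG itself (positions 5 to 7).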

In western blot studies, HIP1 is identified as a 120 kd protein (11, 23), while the 
putative translation of the previously published cDNA gives a protein product with an 
estimated molecular weight of approximately 100 kd. If the HIP1 gene were translated from the 
ATG 284 bp upstream of exon 1, the expected product would have an estimated molecular weight 
of 122 kd. RNA PCR studies with primers downstream of this ATG and primers in exon 7 
amplify expected products of 576 and 600 bp. Taken together, these data support the 
contention that exon 1 extends further 5' and that the HIP1 gene is translated from the ATG in 
exon 1. Sequence analyses showed no TATA box, CAAT box or GC-rich promoter sequence 
upstream of the exon 1 ATG. The promoter prediction programs provided by the server 
http://dot.imgen.bcm.tmc.edu:9331/seq.search/gene.search.html did not predict any promoter 
upstream of the ATG at position -284 (position 0 corresponds to the first nucleotide of the 
published cDNA, GenBank accession U79734). This suggests that HIP1 may have additional 
exons. 

Finally, we evaluated the HIP1 gene as a candidate gene for Huntington disease in 
families without CAG expansion. In a large study of 1022 patients with a clinical diagnosis 
of HD, no CAG repeat expansion was found in 12 patients, who might represent phenocopies 
of HD. In at least three families, linkage studies have excluded the HD locus at 4p. 
Mutation in an interacting protein could result in a similar phenotype, as illustrated by the 
discovery of mutations in dystrophin-associated proteins in muscular dystrophies. A 
mutation in HIP1 may result in altered interaction of huntingtin and HIP1 and lead to cellular 
toxicity as a result of more HIP1 being free in the cytosol. Thus mutations in genes encoding 
huntingtin interacting proteins may cause a phenotype suggestive of HD. We studied two of the 
larger families diagnosed with HD without CAG expansion in the HD gene, with the highly 
informative marker D7S1816, which maps centromeric and very close to the HIP1 gene. The 
clinical findings in both families were compatible with a diagnosis of HD, although there 
were atypical features. In family 1733, the HIP1 locus appears to be excluded, as there are two 
recombinants with the marker. Individuals II-5 and II-7, who do not share the haplotype with 
the affected individuals, are now 41 and 39 years old and have normal neurological 
examinations. 

In family 1602, a lod score of 1.92 was obtained with the marker D7S1816 at θ=0. 
Sequencing of all the coding exons did not reveal any mutation in any exon sequence. The 
promoter sequence has not been examined. Subsequently, a whole genome scan revealed 
higher lod scores for markers on chromosome 20p. 
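
For readers unfamiliar with lod scores, the Python sketch below shows the textbook two-point calculation for the simplest case of phase-known, fully informative meioses. It is an assumption-laden illustration, not the linkage analysis actually applied to families 1733 and 1602, whose pedigree data and software are not described here; the function name and the example counts are hypothetical.

```python
import math


def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point lod score Z(theta) = log10(L(theta) / L(0.5)).

    Assumes phase-known, fully informative meioses, so that
    L(theta) = theta**k * (1 - theta)**(n - k) with k recombinants in n meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must lie between 0 and the number of meioses")
    k, n = n_recombinants, n_meioses
    if theta == 0.0 and k > 0:
        # Any recombinant makes complete linkage impossible.
        return float("-inf")
    log_l_theta = (k * math.log10(theta) if k > 0 else 0.0) + (n - k) * math.log10(1.0 - theta)
    return log_l_theta - n * math.log10(0.5)


# Hypothetical example: six informative meioses and no recombinants.
example_lod = lod_score(6, 0, 0.0)
```

Under these assumptions each non-recombinant meiosis adds log10(2), about 0.30, to the score at θ=0, while a single recombinant drives the score at θ=0 to minus infinity, which is the sense in which recombinants exclude a locus.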

EXAMPLE 11 

A mouse brain lambda ZAPII cDNA library (Stratagene # 93609) was screened with 
various mouse ESTs which showed homology to the human HIP1 cDNA sequence (see Fig. 
7). The ESTs were initially isolated from the non-redundant Database of GenBank EST 
Division by performing a BLASTN search using a fragment of the human HIP1 cDNA as the 
query. We obtained 4 different ESTs which showed homology to HIP1: 

1) aa110840 (clone 520282), which is 399bp and shows 58% identity, at the nucleotide 
level, to position 1880 to 2259 of the HIP1 cDNA. 
2) w82687 (clone 404331), which is 420bp and shows 66% identity, at the nucleotide 
level, to position 2750 to 2915 of the HIP1 cDNA. 
3) aa138903 (clone 586510), which is 509bp and shows 88% identity, at the nucleotide 
level, to position 2763 to 2832 of the HIP1 cDNA. 
4) aa388714 (clone 569088), which is 404bp and shows 88% identity, at the nucleotide 
level, to position 2475 to 2692 of the HIP1 cDNA. 

mHIP1: 

Fifty nanograms of a 362bp KpnI & PvuII fragment of clone 569088 (containing EST 
aa388714) was radioactively labeled with [32P]-dCTP using random priming. The probe 
was allowed to hybridize to filters containing >2 x 10^5 pfu/ml of the mouse brain lambda 
ZAPII cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (0.5M sodium 
phosphate buffer (pH 7.2), 2.7% SDS, 1mM EDTA). The filters were washed at room 
temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at 65°C for 20 minutes with 
1X SSPE, 0.1% SDS and finally twice at 65°C with 0.5X SSPE, 0.1% SDS. The filters were 
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated 
and replated, and subsequent secondary positives were hybridized and washed as for the 
primary screen. The resulting positive phage was converted into plasmid DNA by conventional 
methods (Stratagene) and the cDNA, termed 4n-nl, was isolated and sequenced 551bp and 
541bp from the T7 and T3 ends, respectively. 4n-nl is 2.2kb in length and the T7 end showed 
72% identity, at the nucleotide level, to position 1486 to 1715 of the HIP1 cDNA. The 2.2kb 
insert from 4n-nl was excised using EcoRI. Fifty nanograms of the 2.2kb insert was used to 
produce a radioactive probe, which was used to screen the mouse brain lambda ZAPII cDNA 
library (Stratagene # 93609) in the same manner as above. The resulting positive phage was 
converted into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed 
mHIP1, was isolated and completely sequenced. mHIP1 is 2.3kb in length and showed 85% 
identity, at the nucleotide level, to position 726 to 3072 of the HIP1 cDNA. 

mHIP1a: 

Fifty nanograms of a 1.3kb EcoRI & NcoI fragment of clone 404331 (containing EST 
w82687) was radioactively labeled with [32P]-dCTP using random priming. The probe was 
allowed to hybridize to filters containing >2 x 10^5 pfu/ml of the mouse brain lambda ZAPII 
cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (see above). The 
filters were washed at room temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at 
65°C for 20 minutes with 1X SSPE, 0.1% SDS and finally twice at 65°C with 0.2X SSPE, 
0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5) overnight at -70°C. 
Primary positives were isolated and replated, and subsequent secondary positives were 
hybridized and washed as for the primary screen. The resulting positive phage was converted 
into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed mHIP1a, was 
isolated and completely sequenced. mHIP1a is 3.96 kb in length and shows 60% identity, at 
the nucleotide level, to position 12 to 2703 of the HIP1 cDNA. 

EXAMPLE 12 

HIP1a: 

The entire mHIP1a cDNA sequence was used to screen the non-redundant Database 
of GenBank EST Division. We identified a human EST, T08283, which showed homology to 
mHIP1a. T08283 (clone HIBBB80) is 391bp and shows 87% identity, at the nucleotide level, 
to position 2904 to 3113 of the mHIP1a cDNA. 

Fifty nanograms of a 1.6kb HindIII & NotI fragment of clone 404331 (containing 
EST T08283) was radioactively labeled with [32P]-dCTP using random priming. The probe 
was allowed to hybridize to filters containing >2 x 10^5 pfu/ml of a human frontal cortex 
lambda cDNA library overnight at 65°C in Church buffer (see above). The filters were 
washed at 65°C for 10 minutes with 1X SSPE, 0.1% SDS, and then for 30 minutes and 15 
minutes with 0.1X SSPE, 0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5) 
overnight at -70°C. Primary positives were isolated and replated, and subsequent secondary 
positives were hybridized and washed as for the primary screen. The resulting positive phage 
was converted into plasmid DNA by conventional methods (Stratagene) and the cDNA, 
termed HIP1a, was isolated and completely sequenced. HIP1a is 3.2 kb in length and shows 
53% identity, at the nucleotide level, to position 876 to 3058 of the HIP1 cDNA. 



EXAMPLE 13 

Following the identification of a 1.2 kb partial human HIP-1 cDNA by yeast 
two-hybrid interaction studies, a 3.9 kb HIP-1 fragment was isolated from a cDNA library, 
ligated to a 5' RACE product, then subcloned into the mammalian expression vector pCI-neo 
(Promega). This construct, CMV-HIP-1, expresses HIP-1 from the CMV promoter and was 
used in the cell expression studies described below. Mouse HIP-1a (mHIP-1a) was also 
subcloned into a CMV-driven expression vector for cell culture expression studies. 



EXAMPLE 14 

Huntingtin proteins with expanded polyglutamine tracts can aggregate into large,
irregularly shaped deposits in HD brains, transgenic mice and in vitro cell culture. We have
shown that in HEK (human embryonic kidney) 293T cells the aggregation of full-length and
larger huntingtin fragments occurs after the cells have been exposed to a period of apoptotic
stress. In order to assess the consequence of HIP-1 expression in cultured cells, we used
huntingtin aggregation as one marker of viability.

Human embryonic kidney cells (HEK 293T) were grown on glass coverslips in
Dulbecco's modified Eagle medium (DMEM, Gibco, NY) with 10% fetal bovine serum and
antibiotics, in 5% CO2 at 37°C. The cells were transfected at 30% confluency with the
calcium phosphate protocol by mixing Qiagen-prepared DNA (Qiagen, CA) with 2.5 M
CaCl2, then incubating at room temperature for 10 min. 2X HEPES buffer (240 mM NaCl,
3.0 mM Na2HPO4, 100 mM HEPES, pH 7.05) was added to the DNA/calcium mixture,
incubated at 37°C for 60 sec, then added to the cells. After 12-18 h, the media was removed,
the cells were washed and fresh media was added. At 36 h post-transfection, the cells were
exposed to an apoptotic stress by treatment with 35 µM tamoxifen (Sigma) for 1 hour, or left
untreated, then processed for immunofluorescence. The cells were washed with PBS, fixed in
4% paraformaldehyde/PBS solution for 20 minutes at room temperature, then permeabilized
in 0.5% Triton X-100/PBS for 5 min. Following three PBS washes, the cells were incubated
with anti-huntingtin antibody MAB2166 (Chemicon) (1:2500 dilution) and anti-HIP-1
antibody HIP-1fp (1:100 dilution) in 0.4% BSA/PBS for 1 h at room temperature in a
humidified container. The primary antibody was removed, the cells were washed and
secondary antibodies conjugated to Texas red or FITC were added at a 1:600-1:800 dilution
for 30 min at room temperature. The cells were then washed again, and the coverslips were
mounted onto slides with DAPI (4',6'-diamidino-2-phenylindole, Sigma) as a nuclear
counter-stain. Immunofluorescence was viewed using a Zeiss (Axioscope) microscope,
digitally captured with a CCD camera (Princeton Instrument Inc.) and the images were
colourized and overlapped using the Eclipse (Empix Imaging Inc.) software program.
Appropriate control experiments were performed to determine the specificity of the
antibodies, including secondary antibody only and mock transfected cells.

The huntingtin fragment HD1955 was used in the aggregation studies. This fragment
represents the N-terminal 548 amino acids of huntingtin, and corresponds approximately to
the polyglutamine-containing fragment produced by caspase 3 cleavage of huntingtin.
Transfection of HD1955 with 15 polyglutamines (HD1955-15) results in a diffuse
cytoplasmic distribution of the expressed protein. Transfection of HD1955 with 128
polyglutamines (HD1955-128) also results in diffuse cytoplasmic expression. However,
exposure of cells transfected with HD1955-128 to tamoxifen results in a marked
redistribution of huntingtin. In 29% of cells expressing HD1955-128, the huntingtin protein
appears as dense aggregates that are localized in the perinuclear area of the cell. In contrast,
less than 1% of HD1955-128 expressing cells contain aggregates in the absence of tamoxifen,
and 0% of HD1955-15 cells form aggregates in the presence or absence of tamoxifen
treatment.

Co-transfection of HIP-1 and HD1955 was used to test the influence of HIP-1 on
huntingtin aggregation. As a control, β-galactosidase was co-transfected with HD1955. In the
control transfections, 1-2% of cells expressing HD1955-128 formed aggregates in the absence
of tamoxifen, similar to HD1955-128 expressed alone. However, when HD1955-128 was
co-expressed with HIP-1, an average of 14% of huntingtin-expressing cells contained
aggregates with no tamoxifen treatment. Double-labeling demonstrated that the majority of
the cells containing aggregates also expressed HIP-1, directly implicating HIP-1 in the
increase in aggregation. Therefore, these results indicate that HIP-1 provides sufficient stress
on the huntingtin-expressing cells to form aggregates, to the extent that tamoxifen is no
longer necessary.

EXAMPLE 15 

We next designed a series of experiments to identify a region of HIP-1 sufficient for
inducing aggregate formation of HD1955-128. As described above, HIP-1 contains a domain
with high homology to the death effector domains (DED) present in many apoptosis-related
proteins. The DED domain of HIP-1 was ligated in-frame to HD1955-128, 3' from the
caspase-3 cleavage site. Transfection of the resulting fusion protein with the DED ligated in
the sense orientation (HD1955-128-DEDsense) resulted in a large number (30-50%) of cells
containing aggregates, without tamoxifen incubation. In contrast, expression of a
huntingtin-DED fusion protein with DED in the antisense orientation
(HD1955-128-DEDantisense) did not have more aggregates than the HD1955-128 no
tamoxifen control. Therefore, the DED domain of HIP-1 is sufficient to stress the cells,
causing aggregate formation.

EXAMPLE 16

To directly assess the effect of HIP-1 expression on cell viability, mitochondrial
function tests were performed on 293T cells transfected with HIP-1. The assessment of
mitochondrial function, using the MTT assay (Carmichael et al., Cancer Res. 47: 936-942
(1987); Vistica et al., Cancer Res. 51: 2515-2520 (1991)), is a standard method to measure
cell viability. The MTT assay quantitates the formation of a coloured substrate (formazan),
with the mitochondria of viable cells forming more substrate than non-viable cells. Since
decreased mitochondrial activity is an early consequence of many cellular toxins, the MTT
assay provides an early indicator of cell damage.

For cell viability assays, HEK 293T cells were seeded at a density of 5 x 10^4 cells into
96-well plates and transfected with 0.1 µg or 0.08 µg HIP-1 or 0.1 µg of the control construct
lacZ using the calcium phosphate method described above. At 24-36 hours post-transfection,
tamoxifen-treated cells were incubated for 2 hours in a 1:10 dilution of WST-1 reagent
(Boehringer Mannheim) and release of formazan from mitochondria was quantified at 450
nm using an ELISA plate reader (Dynatech Laboratories) (Carmichael et al., 1987; Mosmann,
J. Immunol. Meth. 65: 55-63 (1983)). One-way ANOVA and the Newman-Keuls test were used
for statistical analysis. The transfection efficiency, measured by β-galactosidase staining and
immunofluorescence, was approximately 50%.
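
As an illustration of the one-way ANOVA step named above, the following is a minimal sketch of the F statistic computed from per-group absorbance readings. The function name and the absorbance values are hypothetical and are not data from these experiments; the Newman-Keuls post-hoc comparison and the p-value lookup are not implemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the given groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual readings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical A450 readings, n=4 wells per condition (illustrative values only).
lacz_control = [1.02, 0.98, 1.05, 0.95]
hip1_0_10_ug = [0.71, 0.66, 0.74, 0.69]
hip1_0_08_ug = [0.78, 0.80, 0.75, 0.83]
f_stat, df_b, df_w = one_way_anova_f([lacz_control, hip1_0_10_ug, hip1_0_08_ug])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

In practice a statistics package would normally be used instead; for example, `scipy.stats.f_oneway` returns both the F statistic and its p-value.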

We have previously demonstrated that expression of mutant huntingtin results in
increased susceptibility to an apoptotic stress induced by sub-lethal doses of tamoxifen in
transfected 293T cells (Martindale et al., 1998). A similar assay was used to test the
consequence of HIP-1 expression. With 0.1 µg transfected HIP-1 DNA, after 24 hr
expression, HIP-1 resulted in increased cell death in response to tamoxifen, compared with
the tamoxifen-treated β-galactosidase control (p<0.01, n=4). Reducing the amount of
transfected HIP-1 DNA to 0.08 µg also resulted in increased cell death compared with control
(p<0.01, n=4), indicating the high potency of HIP-1 (Fig. 8). Furthermore, increased cell
death in cells transfected with HIP-1 was observed in the absence of apoptotic stress at 48 hrs
post-transfection, but was so severe that it could not be accurately quantitated. Thus, an
earlier time point (24 hr) had to be used for better reproducibility, using an apoptotic stress to
unmask the phenotype.



WO 99/60986 PCT/US99/11743



In order to model a pathogenic interaction of HIP-1 and huntingtin in the HEK 293
mammalian cell system, HIP-1 was transfected into cell lines stably expressing huntingtin.
Two cell lines were chosen for the initial studies: one line expressed the truncated HD1955
construct with 15 glutamines, and the second expressed HD1955 with 128 repeats.
Western blotting indicated that the cell lines expressed huntingtin at similar levels. To assess
whether HIP-1 is toxic in the presence of mutant huntingtin, 0.1 µg HIP-1 DNA was
transfected into the HD1955-128 cell line. Transfection of HIP-1 into the HD1955-15 cell
line was used as the wild-type huntingtin control, and transfection of LacZ into both cell lines
was the control for transfection-related toxicity (Figs 9A and 9B). MTT toxicity assays
showed that HIP-1 in the presence of mutant huntingtin (HD1955-128) was significantly
more toxic than HIP-1 with wild-type huntingtin (HD1955-15), p<0.001, n=4 (Fig. 9C). This
toxicity was observed at 24 hr and 36 hr post-transfection. No tamoxifen was needed to
unmask the phenotype, suggesting that the combined cell stress of HIP-1 with truncated
huntingtin was sufficient to reduce cell viability over control.
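
One common way to express such plate-reader results is as viability relative to the transfection control. The sketch below assumes viability is reported as the mean absorbance of the test wells divided by the mean absorbance of the LacZ control wells, after subtracting an optional blank; the text does not state the exact normalization used, and the function name and readings are hypothetical.

```python
from statistics import mean


def percent_viability(sample_abs, control_abs, blank_abs=0.0):
    """Mean sample absorbance as a percentage of mean control absorbance."""
    control = mean(control_abs) - blank_abs
    if control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return 100.0 * (mean(sample_abs) - blank_abs) / control


# Hypothetical A450 readings, n=4 wells each (illustrative values only).
hip1_in_hd1955_128 = [0.52, 0.49, 0.55, 0.50]
lacz_in_hd1955_128 = [1.01, 0.97, 1.03, 0.99]
print(f"{percent_viability(hip1_in_hd1955_128, lacz_in_hd1955_128):.1f}% of control")
```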

CLAIMS

1. A polypeptide comprising the sequence given by Seq. ID. No. 5.

2. A cDNA molecule comprising the sequence given by Seq. ID No. 6.

3. A polypeptide comprising the sequence given by Seq. ID No. 7.

4. A method for ameliorating the effects of Huntington's disease in a
patient expressing a HIP-apoptosis modulating protein, comprising the step of administering
to the patient a therapeutic composition which reduces the activity of the HIP-apoptosis
modulating protein.

5. A method according to claim 4, wherein the composition comprises a
material which binds to HIP-apoptosis modulating protein.

6. The method according to claim 4, wherein the composition comprises
an expression vector encoding huntingtin having a normal number of repeats.

7. An expression vector for expression of a gene in a mammalian host
comprising a region encoding an HD-interacting polypeptide.

8. The expression vector according to claim 7, wherein the HD-interacting
polypeptide is an HIP-apoptosis modulating protein.

9. The expression vector according to claim 8, wherein the HIP-apoptosis
modulating protein has a sequence which includes the amino acid sequences given by SEQ
ID Nos. 2, 4, 5 or 7.

10. The expression vector of claim 7, wherein the HD-interacting
polypeptide interacts differently with expanded Huntingtin than with Huntingtin having a
CAG repeat region containing 15 to 35 repeats.

11. The expression vector according to any of claims 7-10, further
comprising a region encoding Huntingtin having a polyglutamine tract of 35 or fewer.

12. A method for inducing apoptotic death in cells, comprising the step of
introducing into the cells an expression vector encoding at least the death effector domain of
a HIP-apoptosis modulating protein whereby the death effector domain is expressed by the
cells.

13. The method of claim 12, wherein the expression vector encodes the
amino acid sequence given by Seq. ID. No. 2.

14. The method of claim 12, wherein the expression vector encodes the
amino acid sequence given by Seq. ID. No. 4.

15. A method for screening a composition for the ability to inhibit
apoptosis induced by an HIP-apoptosis modulating protein, comprising simultaneously
exposing a population of cells to the composition and an HIP-apoptosis modulating protein
and measuring the extent of cell death.
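
Claims 10 and 11 refer to the number of repeats in the CAG repeat region. As a rough illustration only, the sketch below counts the longest uninterrupted run of CAG codons in a DNA string and checks it against the 15 to 35 repeat range recited in claim 10. The function names are hypothetical, the example sequences are synthetic rather than taken from the HD gene, and real repeat sizing (for example, handling interrupted tracts) is more involved.

```python
import re


def longest_cag_run(dna):
    """Number of CAG units in the longest uninterrupted CAG run."""
    seq = re.sub(r"[\s\d]", "", dna.upper())  # drop spacing and position numbers
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)


def in_claim_10_range(n_repeats, low=15, high=35):
    """True if the repeat count falls in the 15 to 35 range recited in claim 10."""
    return low <= n_repeats <= high


# Synthetic test sequences (not HD gene sequence).
normal_like = "ATG" + "CAG" * 20 + "CCGCCG"
expanded_like = "ATG" + "CAG" * 128 + "CCGCCG"
for name, seq in [("normal-like", normal_like), ("expanded-like", expanded_like)]:
    n = longest_cag_run(seq)
    print(name, n, in_claim_10_range(n))
```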

SAEVIHQVEEALDTDEKKHLLFLCRDVAIDWPPNVaOLLDlLRERGKLSVCDLAELLYRVKRFDLLKRILK



>Usurpin A

>Usurpin B

yRVLMAHIGEDLDKSDVSSLIFLMKDYMGRGKISKHKSFLDLVVELHKLNLVAPDQLDLLEKCLKNIlIRIDI.KTKIQK 
>Casp-8 A

FSRNLYDrGEL0DSEDLASLKELSLDYI?QRKOEPIKDALMIFQRLOEKRI-<LEESNLSFLKELLFRINRLDLLTTYLN 
>Casp-8 B

YRVMLYQISEEVSREELRSFKFLLQMEISKCKLDDDMNLLDiriEMErCRVILGEGKLDILKRVCAQINKSLLKIND 
>Casp-10 A

FRHfaLTIDSNLGVQDVENLKFLCIGLVPNKKLEKSSGASDVFEJlLLAnDLLSEEDPFFLAELLYIIRQKiaLQHLNC 
>Casp-10 B

FRNllYELSEGIDSENLKDMrFLLECDSLPKTEMTSLSFLAFLEKQGiaDEDHLTCLEDLCKTWPKLLRNIEK 
>FADD 

rLVLLHSVSS5LSSSELTELKrLCLGRVGKRKLERVQSGLDLFSMLLEQHDLEPGHTELI.REI.LASLRRHDLLRRVDD 
>MC159 A

SLPFLRilLLEELDSKEDSLLLFLGHDAAPGCTTVTQALCSLSQQRKLTLAALVEMLYVLQRMDLLKSRFG 
>MC159 B 

YHKLMVCVGEELDSSELRALRLFACNLNPSLSTALSESSRP'/ELVLALENVGLVSPSSVSVLADf-ILRTLRRLDLCQQLVE 
>E8 

FRCLMALVNDFLSDKEVEEHYFLCAPRLESHLEPGSKKSrLRLASLLEDLELLGGDKLTFLRKLLTTIGRADLVKNLQV 
>KS orfK13A

TYEVLCEVARKLGTDDREVVLFLLUVFLPQPTLAQLIGALRALKEEGRLTFPLLAECLPRAGRRDLLRDLL.H 

>KS orfK13B
YQLTVLHVDGELCARDrRSLIFLSKDTIGSRSTPQTFLHNVyCMENLDLLGPTDVDALMS^^LRSLSRVDr>QRQVQT 

SELEADIJ\EQOHLRQQAADDCEFLRA£LDELRRQREDTEKAQRSLSElERKAQANEQRYSKLKEKySELVQN 
AE 

>HIPla 

GELEEQRKQKQKALVDNEQLRHELAQLRAAQLERERSQGLREEAERKASATEARYNKLKEKHSELVnVirAELLRKNAD 

NGLEAELEEORKQKQKALVDNEOLRHEIJVQLKALOLEGARWQGLREEAERKASATEARySKLKEKH 
AD 

>mHIP1

SELEAEU^EQQHLGRQAMDDCEFLRTELDELKROREDTEKAQRSLTEIERKAQANEQRYSKLKEKYSELVQNHADLLRKN 
AE 

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Kalchman, Michael
Hayden, Michael R.
Hackam, Abigail
Chopra, Vikramjit Singh
Nicholson, Donald W.
Vallaincourt, John P.
Rasper, Dita M.

(ii) TITLE OF INVENTION: Apoptosis Modulators That Interact with the
Huntington's Disease Gene

(iii) NUMBER OF SEQUENCES: 44

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Oppedahl & Larson 

(B) STREET: PO Box 5270 

(C) CITY: Frisco 

(D) STATE: CO 

(E) COUNTRY: USA 

(F) ZIP: 80443-5270 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Kb storage 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: MS DOS 5.0 

(D) SOFTWARE: WordPerfect 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Larson, Marina T. 

(B) REGISTRATION NUMBER: 32038 

(C) REFERENCE/DOCKET NUMBER: UBC.P-013US2 
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (970) 668-2050 

(B) TELEFAX: (970) 668-2052 

(2) INFORMATION FOR SEQ ID NO: 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50 
AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100 
GGGTCATTCA GATCCCCCAG CTGCCTGAGA ACCCACCCAA CTTCCTGCGA 150 
GCCTCAGCCC TGTCAGAACA TATCAGCCCT GTGGTGGTGA TCCCTGCAGA 200 
GGCCTCATCC CCCGACAGCG AGCCAGTCCT AGAGAAGGAT GACCTCATGG 250 
ACATGGATGC CTCTCAGCAG AATTTATTTG ACAACAAGTT TGATGACNTC 300 
TTTGGCAGTT CATCCAGCAG TGATCCCTTC AATTTCAACA GTCAAAATGG 350 
TGTGAACAAG GATGAGAAGG ACCACTTAAT TGAGCGACTA TACAGAGAGA 400 
TCAGTGGATT GAAGGCACAG CTAGAAAACA TGAAGACTGA GAGCCAGCGG 450 
GTTGTGCTGC AGCTGAAGGG CCACGTCAGC GAGCTGGAAG CAGATCTGGC 500 
CGAGCAGCAG CACCTGCGGC AGCAGGCGGC CGACGACTGT GAATTCCTGC 550 
GGGCAGAACT GGACGAGCTC AGGNGGCAGC GGGAGGACAC CGAGAAGGCT 600
CAGCGGAGCC TGTCTGAGAT AGAAAGGAAA GCTCAAGCCA ATGAACAGCG 650 
ATATAGCAAG CTAAAGGAGA AGTACAGCGA GCTGGTTCAG AACCACGCTG 700 
ACCTGCTGCG GAAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA 750 
CAAGCCCAGG TAGATTTGGA ACGAGAGAAA AAAGAGCTGG AGGATTCGTT 800 
GGAGCGCATC AGTGACCAGG GCCAGCGGAA GACTCAAGAA GAGCTGGAAG 850 
TTCTAGAGAG CTTGAAGCAG GAACTTGGCA CAAGCCAACG GGAGCTTCAG 900 
GTTCTGCAAG GCAGCCTGGA AACTTCTGCC CAGTCAGAAG CAAACTGGGC 950 
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG 1000 
CAGCTCATAG GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC 1050 
ACTCAGCTCA AACTGGCCAG CACAGAGGAA TCTATGTGCC AGCTTGCCAA 1100 
AGACCAACGA AAAATGCTTC TGGTGGGGTC CAGGAAGGCT GCGGAGCAGG 1150 
TGATACAAGA CGCG 1164 
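
The relationship between a cDNA entry and the corresponding protein entry in this listing can be checked with a short translation sketch. It assumes the standard genetic code and a reading frame starting at position 1, and it reports one-letter amino acid codes; the function name is hypothetical. The input below is the first 50-base row of SEQ ID NO:1.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna):
    """Translate a cDNA string in frame 1; ambiguous codons (e.g. with N) become 'X'."""
    seq = "".join(ch for ch in cdna.upper() if ch.isalpha())  # drop spaces and numbers
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


first_row = "ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC"
print(translate(first_row))  # TADTLQGHRDRFMEQF
```

The first fifteen residues (TADTLQGHRDRFMEQ) correspond to the first row of SEQ ID NO:2, Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln, and the sixteenth (F) to the Phe that opens its second row.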

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 

(B) TYPE: protein 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln
1 5 10 15

Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser Asn Leu Gln
20 25 30

Tyr Phe Lys Arg Val Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro
35 40 45

Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro
50 55 60

Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp Ser Glu Pro
65 70 75

Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln
80 85 90

Asn Leu Phe Asp Asn Lys Phe Asp Asp Phe Gly Ser Ser Ser Ser
95 100 105

Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly Val Asn Lys Asp
110 115 120

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly
125 130 135

Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu Ser Gln Arg Val
140 145 150

Val Leu Gln Leu Lys Gly His Val Ser Glu Leu Glu Ala Asp Leu
155 160 165

Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala Asp Asp Cys Glu
170 175 180

Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Gln Arg Glu Asp Thr
185 190 195

Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile Glu Arg Lys Ala Gln
200 205 210

Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser Glu
215 220 225

Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu Val
230 235 240

Thr Lys Gln Val Ser Met Ala Arg Gln Ala Gln Val Asp Leu Glu
245 250 255

Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu Glu Arg Ile Ser Asp
260 265 270

Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu Glu Val Leu Glu Ser
275 280 285

Leu Lys Gln Glu Leu Gly Thr Ser Gln Arg Glu Leu Gln Val Leu
290 295 300

Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser Glu Ala Asn Trp Ala
305 310 315

Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg Asp Ser Leu Val Ser
320 325 330

Gly Ala Ala His Arg Glu Glu Glu Leu Ser Ala Leu Arg Lys Glu
335 340 345

Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser Thr Glu Glu Ser Met
350 355 360

Cys Gln Leu Ala Lys Asp Gln Arg Lys Met Leu Leu Val Gly Ser
365 370 375

Arg Lys Ala Ala Glu Gln Val Ile Gln Asp Ala
380 385 386

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4796

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAGTGTACGG TTGATCATAT AACGCCGCGG GCGGGGATTG GTTTATATAT 50
CGCAAATTGA TNTAGGGGGG GGGGGATGGN CAGAGATTTC GCTTCATTAG 100
GCCATTATAA GCAGGAAGGG TTTCAAGGAA AAAAACCCAG AAAGTGCATA 150
TTGCACCCAC CATGAGAAAG GGGCAACAGA CCTTNTGTTN TGTTNTCAAC 200
CGCCTGCTTC TGTTTTAGCA ACGCAGTGTT TTGGTGGAAG TTGTGCCATG 250
TGTTCCACAA ANTCTTCCGA GATGGACACC CGAACGTCCT GT^GGACTTT 300
GTGAGATACA GAAATGAATT GAGTGACATG AGCAGGATGT GGGGCCACCT 350
GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA CTGCTAAGAA 400
CCAAGATGGA GTACCACACC AAAAATCCCA GGTTCCCAGG CAACCTGCAG 450
ATGAGTGACC GCCAGCTGGA CGAGGCTGGA GAAAGTGACG TGAACAACTT 500
TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT GAACTCAACC 550
TCTTCCAAAC AGTATTCAAC TCCCTGGACA TGTCCCGCTC TGTGTCCGTG 600
ACGGCAGCAG GGCAGTGCCG CCTCGCCCCG CTGATCCAGG TCATCTTGGA 650
CTGCAGCCAC CTTTATGACT ACACTGTCAA GCTTCTCTTC T^CTCCACT 700
CCTGCCTCCC AGCTGACACC CTGCAAGGCC ACCGGGACCG CTTCATGGAG 750
CAGTTTACAA AGTTGAAAGA TCTGTTCTAC CGCTCCAGCA ACCTGCAGTA 800
CTTCAAGCGG CTCATTCAGA TCCCCCAGCT GCCTGAGAAC CCACCCAACT 850
TCCTGCGAGC CTCAGCCCTG TCAGAACATA TCAGCCCTGT GGTGGTGATC 900
CCTGCAGAGG CCTCATCCCC CGACAGCGAG CCAGTCCTAG AGAAGGATGA 950
CCTCATGGAC ATGGATGCCT CTCAGCAGAA TTTATTTGAC AACAAGTTTG 1000
ATGACATCTT TGGCAGTTCA TTCAGCAGTG ATCCCTTCAA TTTCAACAGT 1050
CAAAATGGTG TGAACAAGGA TGAGAAGGAC CACTTAATTG AGCGACTATA 1100
CAGAGAGATC AGTGGATTGA AGGCACAGCT AGAAAACATG AAGACTGAGA 1150
GCCAGCGGGT TGTGCTGCAG CTGAAGGGCC ACGTCAGCGA GCTGGAAGCA 1200
GATCTGGCCG AGCAGCAGCA CCTGCGGCAG CAGGCGGCCG ACGACTGTGA 1250
ATTCCTGCGG GCAGAACTGG ACGAGCTCAG GAGGCAGCGG GAGGACACCG 1300
AGAAGGCTCA GCGGAGCCTG TCTGAGATAG AAAGGAAAGC TCAAGCCAAT 1350
GAACAGCGAT ATAGCAAGCT AAAGGAGAAG TACAGCGAGC TGGTTCAGAA 1400
CCACGCTGAC CTGCTGCGGA AGAATGCAGA GGTGACCAAA CAGGTGTCCA 1450
TGGCCAGACA AGCCCAGGTA GATTTGGAAC GAGAGAAAAA AGAGCTGGAG 1500
GATTCGTTGG AGCGCATCAG TGACCAGGGC CAGCGGAAGA CTCAAGAACA 1550
GCTGGAAGTT CTAGAGAGCT TGAAGCAGGA ACTTGGCACA AGCCAACGGG 1600
AGCTTCAGGT TCTGCAAGGC AGCCTGGAAA CTTCTGCCCA GTCAGAAGCA 1650
AACTGGGCAG CCGAGTTCGC CGAGCTAGAG AAGGAGCGGG ACAGCCTGGT 1700
GAGTGGCGCA GCTCATAGGG AGGAGGAATT ATCTGCTCTT CGGAAAGAAC 1750
TGCAGGACAC TCAGCTCAAA CTGGCCAGCA CAGAGGAATC TATGTGCCAG 1800
CTTGCCAAAG ACCAACGAAA AATGCTTCTG GTGGGGTCCA GGAAGGCTGC 1850
GGAGCAGGTG ATACAAGACG CCCTGAACCA GCTTGAAGAA CCTCCTCTCA 1900
TCAGCTGCGC TGGGTCTGCA GATCACCTCC TCTCCACGGT CACATCCATT 1950
TCCAGCTGCA TCGAGCAACT GGAGAAAAGC TGGAGCCAGT ATCTGGCCTG 2000
CCCAGAAGAC ATCAGTGGAC TTCTCCATTC CATAACCCTG CTGGCCCACT 2050
TGACCAGCGA CGCCATTGCT CATGGTGCCA CCACCTGCCT CAGAGCCCCA 2100
CCTGAGCCTG CCGACTCACT GACCGAGGCC TGTAAGCAGT ATGGCAGGGA 2150
AACCCTCGCC TACCTGGCCT CCCTGGAGGA AGAGGGAAGC CTTGAGAATG 2200
CCGACAGCAC AGCCATGAGG AACTGCCTGA GCAAGATCAA GGCCATCGGC 2250
GAGGAGCTCC TGCCCAGGGG ACTGGACATC AAGCAGGAGG AGCTGGGGGA 2300
CCTGGTGGAC AAGGAGATGG CGGCCACTTC AGCTGCTATT GAAACTTGCA 2350
CGGCCAGAAT AGAGGAGATG CTCAGCAAAT CCCGAGCAGG AGACACAGGA 2400
GTCAAATTGG AGGTGAATGA AAGGATCCTT CGTTGCTGTA CCAGCCTCAT 2450
GCAAGCTATT CAGGTGCTCA TCGTGGCCTC TAAGGACCTC CAGAGAGAGA 2500
TTGTGGAGAG CGGCAGGGGT ACAGCATCCC CTAAAGAGTT TTATGCCAAG 2550
AACTCTCGAT GGACAGAAGG ACTTATCTCA GCCTCCAAGG CTGTGGGCTG 2600
GGGAGCCACT GTCATGGTGG ATGCAGCTGA TCTGGTGGTA CAAGGCAGAG 2650
GGAAATTTGA GGAGCTAATG GTGTGTTCTC ATGAAATTGC TGCTAGCACA 2700
GCCCAGCTTG TGGCTGCATC CAAGGTGAAA GCTGATAAGG ACAGCCCCAA 2750
CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG GCCACTGCCG 2800
GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGAC 2850
AACATGGACT TCTCAAGCAT GACGCTGACA CAGATCAAAC GCCAAGAGAT 2900
GGATTCTCAG GTTAGGGTGC TAGAGCTAGA AAATGAATTG CAGAAGGAGC 2950
GTCAAAAACT GGGAGAGCTT CGGAAAAAGC ACTACGAGCT TGCTGGTGTT 3000
GCTGAGGGCT GGGAAGAAGG AACAGAGGCA TCTCCACCTA CACTGCAAGA 3050
AGTGGTAACC GAAAAAGAAT AGAGCCAAAC CAACACCCCA TATGTCAGTG 3100
TAAATCCTTG TTACCTATCT CGTGTGTGTT ATTTCCCCAG CCACAGGCCA 3150
AATCCTTGGA GTCCCAGGGG CAGCCACACC ACTGCCATTA CCCAGTGCCG 3200
AGGACATGCA TGACACTTCC CAAAGATCCC TCCATAGCGA CACCCTTTCT 3250
GTTTGGACCC ATGGTCATCT CTGTTCTTTT CCCGCCTCCC TAGTTAGCAT 3300
CCAGGCTGGC CAGTGCTGCC CATGAGCAAG CCTAGGTACG AAGAGGGGTG 3350

GTGGGGGGCA GGGCCACTCA ACAGAGAGGA CCAACATCCA GTCCTGCTGA 3400 

CTATTTGACC CCCACAACAA TGGGTATCCT TAATAGAGGA GCTGCTTGTT 3450

GTTTGTTGAC AGCTTGGAAA GGGAAGATCT TATGCCTTTT CTTTTCTGTT 3500 

TTCTTCTCAG TCTTTTCAGT TTCATCATTT GCACAAACTT GTGAGCATCA 3550 

GAGGGCTGAT GGATTCCAAA CCAGGACACT ACCCTGAGAT CTGCACAGTC 3600

AGAAGGACGG CAGGAGTGTC CTGGCTGTGA ATGCCAAAGC CATTCTCCCC 3650 

CTCTTTGGGC AGTGCCATGG ATTTCCACTG CTTCTTATGG TGGTTGGTTG 3700 

GGTTTTTTGG TTTTGTTTTT TTTTTTTAAG TTTCACTCAC ATAGCCAACT 3750 

CTCCCAAAGG GCACACCCCT GGGGCTGAGT CTCCAGGGCC CCCCAACTGT 3800 

GGTAGCTCCA GCGATGGTGC TGCCCAGGCC TCTCGGTGCT CCATCTCCGC 3850 

CTCCACACTG ACCAAGTGCT GGCCCACCCA GTCCATGCTC CAGGGTCAGG 3900 

CGGAGCTGCT GAGTGACAGC TTTCCTCAAA AAGCAGAAGG AGAGTGAGTG 3950 

CCTTTCCCTC CTAAAGCTGA ATCCCGGCGG AAAGCCTCTG TCCGCCTTTA 4000 

CAAGGGAGAA GACAACAGAA AGAGGGACAA GAGGGTTCAC ACAGCCCAGT 4050 

TCCCGTGACG AGGCTCAAAA ACTTGATCAC ATGCTTGAAT GGAGCTGGTG 4100 

AGATCAACAA CACTACTTCC CTGCCGGAAT GAACTGTCCG TGAATGGTCT 4150 

CTGTCAAGCG GGCCGTCTCC CTTGGCCCAG AGACGGAGTG TGGGAGTGAT 4200

TCCCAACTCC TTTCTGCAGA CGTCTGCCTT GGCATCCTCT TGAATAGGAA 4250

GATCGTTCCA CTTTCTACGC AATTGACAAA CCCGGAAGAT CAGATGCAAT 4300 

TGCTCCCATC AGGGAAGAAC CCTATACTTG GTTTGCTACC CTTAGTATTT 4350 

ATTACTAACC TCCCTTAAGC AGCAACAGCC TACAAAGAGA TGCTTGGAGC 4400 

AATCAGAACT TCAGGTGTGA CTCTAGCAAA GCTCATCTTT CTGCCCGGCT 4450 

ACATCAGCCT TCAAGAATCA GAAGAAAGCC AAGGTGCTGG ACTGTTACTG 4500 

ACTTGGATCC CAAAGCAAGG AGATCATTTG GAGCTCTTGG GTCAGAGAAA 4550 

ATGAGAAAGG ACAGAGCCAG CGGCTCCAAC TCCTTTCAGC CACATGCCCC 4600 

AGGCTCTCGC TGCCCTGTGG ACAGGATGAG GACAGAGGGC ACATGAACAG 4650 

CTTGCCAGGG ATGGGCAGCC CAACAGCACT TTTCCTCTTC TAGATGGACC 4700 

CCAGCATTTA AGTGACCTTC TGATCTTGGG AAAACAGCGT CTTCCTTCTT 4750 

TATCTATAGC AACTCATTGG TGGTAGCCAT CAAGCACTTC GGAATT 4796 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Arg Met Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu
1 5 10 15

Cys Ser Ile Tyr Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His
20 25 30

Thr Lys Asn Pro Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg
35 40 45

Gln Leu Asp Glu Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln
50 55 60

Leu Thr Val Glu Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu
65 70 75

Phe Gln Thr Val Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser
80 85 90

Val Thr Ala Ala Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val
95 100 105

Ile Leu Asp Cys Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu
110 115 120

Phe Lys Leu His Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His
125 130 135

Arg Asp Arg Phe Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe
140 145 150

Tyr Arg Ser Ser Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile
155 160 165

Pro Gln Leu Pro Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala
170 175 180

Leu Ser Glu His Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala
185 190 195

Ser Ser Pro Asp Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met
200 205 210

Asp Met Asp Ala Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp
215 220 225

Asp Ile Phe Gly Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn
230 235 240

Ser Gln Asn Gly Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu
245 250 255

Arg Leu Tyr Arg Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn
260 265 270

Met Lys Thr Glu Ser Gln Arg Val Val Leu Gln Leu Lys Gly His
275 280 285

Val Ser Glu Leu Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg
290 295 300

Gln Gln Ala Ala Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp
305 310 315

Glu Leu Arg Arg Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser
320 325 330

Leu Ser Glu Ile Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr
335 340 345

Ser Lys Leu Lys Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala
350 355 360

Asp Leu Leu Arg Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met
365 370 375

Ala Arg Gln Ala Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu
380 385 390

Glu Asp Ser Leu Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr
395 400 405

Gln Glu Gln Leu Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Gly
410 415 420

Thr Ser Gln Arg Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr
425 430 435

Ser Ala Gln Ser Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu
440 445 450

Glu Lys Glu Arg Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu
455 460 465

Glu Glu Leu Ser Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu
470 475 480

Lys Leu Ala Ser Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp
485 490 495

Gln Arg Lys Met Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln
500 505 510

Val Ile Gln Asp Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile
515 520 525

Ser Cys Ala Gly Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser
530 535 540

Ile Ser Ser Cys Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr
545 550 555

Leu Ala Cys Pro Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr
560 565 570

Leu Leu Ala His Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr
575 580 585

Thr Cys Leu Arg Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu
590 595 600

Ala Cys Lys Gln Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser
605 610 615

Leu Glu Glu Glu Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met
620 625 630

Arg Asn Cys Leu Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu
635 640 645

Pro Arg Gly Leu Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val
650 655 660

Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Thr Cys Thr
665 670 675

Ala Arg Ile Glu Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr
680 685 690

Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Arg Cys Cys Thr
695 700 705

Ser Leu Met Gln Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp
710 715 720

Leu Gln Arg Glu Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro
725 730 735

Lys Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile
740 745 750

Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Val Met Val Asp
765 770 775

Ala Ala Asp Leu Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu
780 785 790

Met Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val
795 800 805

Ala Ala Ser Lys Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala
810 815 820

Gln Leu Gln Gln Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly
825 830 835

Val Val Ala Ser Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr
840 845 850

Asp Asn Met Asp Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg
855 860 865

Gln Glu Met Asp Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu
870 875 880

Leu Gln Lys Glu Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His
885 890 895

Tyr Glu Leu Ala Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu
900 905 910

Ala Ser Pro Pro Thr Leu Gln Glu Val Val Thr Glu Lys Glu
915 920 924



(2) INFORMATION FOR SEQ ID NO: 5 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Leu Cys Gin Gly Ser Glu Trp Arg Arg Asp Gin Gin Leu 
5 10 15 

Gly Thr Ala Asn Ala Arg Gin Trp Cys Pro Leu Pro Gin Asp Ala 
20 25 30 

Gin Pro Ala Gly Ser Trp Glu Arg Cys Pro Pro Leu Pro Pro Ala 

35 40 45 

Gly Arg Leu Gin Gly Thr Asp His Pro Trp Gly Trp Gly Arg Leu 
50 55 60 
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Ala Gly Gly Gly Glu Arg Gly Gly Leu Trp Glu Gly Leu Ser His 
65 70 75 

Ser Gin Arg Leu lie His Leu lie Leu Leu Ser Leu Pro Leu Leu 
80 85 90 

Val Phe Gin Thr Val Ser lie Asn Lys Ala lie Asn Thr Gin Glu 
95 100 105 

Val Ala Val Lys Glu Lys His Ala Arg Thr Cys lie Leu Gly Thr 
110 115 120 

His His Glu Lys Gly Ala Gin Thr Phe Trp Ser Val Val Asn Arg 
125 130 135 

Leu Pro Leu Ser Ser Asn Ala Val Leu Cys Trp Lys Phe Cys His 
140 145 150 

Val Phe His Lys Leu Leu Arg Asp Gly His Pro Asn Val Leu Lys 
155 160 165 

Asp Ser Leu Arg Tyr Arg Asn Glu Leu Ser Asp Met Ser Arg Met 
170 175 180 

Trp Gly His Leu Ser Glu Gly Tyr Gly Gin Leu Cys Ser lie Tyr 
185 190 195 

Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His Thr Lys Asn Pro 
200 205 210 

Arg Phe Pro Gly Asn Leu Gin Met Ser Asp Arg Gin Leu Asp Glu 
215 220 225 

Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gin Leu Thr Val Glu 
230 235 240 

Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu Phe Gin Thr Val 
245 250 255 

Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser Val Thr Ala Ala 
260 265 270 

Gly Gin Cys Arg Leu Ala Pro Leu lie Gin Val lie Leu Asp Cys 
275 288 285 



Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu Phe Lys Leu His
290 295 300

Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe
305 310 315

Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser
320 325 330

Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile Pro Gln Leu Pro
335 340 345

Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His
350 355 360

Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp
365 370 375

Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala
380 385 390

Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp Asp Ile Phe Gly
395 400 405

Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly
410 415 420

Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg
425 430 435

Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu
440 445 450

Ser Gln Arg Val Val Leu Gln Leu Lys Gly His Val Ser Glu Leu
455 460 465

Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala
470 475 480

Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Arg
485 490 495

Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile
500 505 510

Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys
515 520 525

Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg
530 535 540

Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met Ala Arg Gln Ala
545 550 555

Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu
560 565 570

Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu
575 580 585

Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Ala Thr Ser Gln Arg
590 595 600

Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser
605 610 615

Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg
620 625 630

Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu Glu Glu Leu Ser
635 640 645

Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser
650 655 660

Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp Gln Arg Lys Met
665 670 675

Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln Val Ile Gln Asp
680 685 690

Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile Ser Cys Ala Gly
695 700 705

Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser Ile Ser Ser Cys
710 715 720

Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr Leu Ala Cys Pro
725 730 735

Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr Leu Leu Ala His
740 745 750

Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr Thr Cys Leu Arg
755 760 765

Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu Ala Cys Lys Gln
770 775 780

Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser Leu Glu Glu Glu
785 790 795

Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met Arg Asn Cys Leu
800 805 810

Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu Pro Arg Gly Leu
815 820 825

Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met
830 835 840

Ala Ala Thr Ser Ala Ala Ile Glu Thr Ala Thr Ala Arg Ile Glu
845 850 855

Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu
860 865 870

Glu Val Asn Glu Arg Ile Leu Gly Cys Cys Thr Ser Leu Met Gln
875 880 885

Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp Leu Gln Arg Glu
890 895 900

Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro Lys Glu Phe Tyr
905 910 915

Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys
920 925 930

Ala Val Gly Trp Gly Ala Thr Val Met Val Asp Ala Ala Asp Leu
935 940 945

Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu Met Val Cys Ser
950 955 960

His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys
965 970 975

Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala Gln Leu Gln Gln
980 985 990

Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly Val Val Ala Ser
995 1000 1005

Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr Asp Asn Met Asp
1010 1015 1020

Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp
1025 1030 1035

Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu Leu Gln Lys Glu
1040 1045 1050

Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Ala
1055 1060 1065

Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Pro
1070 1075 1080

Thr Leu Gln Glu Val Val Thr Glu Lys Glu
1085 1090



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3301 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CGGTGAGCTG GAGGAGCAGC GGAAGCAGAA GCAGAAGGCC CTGGTGGATA 50 

ATGAGCAGCT CCGCCACGAG CTGGCCCAGC TGAGGGCTGC CCAGCTGGAG 100 

CGCGAGCGGA GCCAGGGCCT GCGTGAGGAG GCTGAGAGGA AGGCCAGTGC 150 

CACGGAGGCG CGCTACAACA AGCTGAAGGA AAAGCACAGT GAGCTCGTCC 200 

ATGTGCACGC GGAGCTGCTC AGAAAGAACG CGGACACAGC CAAGCAGCTG 250 

ACGGTGACGC AGCAAAGCCA GGAGGAGGTG GCGCGGGTGA AGGAGCAGCT 300

GGCCTTCCAG GTGGAGCAGG TGAAGCGGGA GTCGGAGTTG AAGCTAGAGG 350

AGAAGAGCGA CCAGCAGGAG AAGCTCAAGA GGGAGCTGGA GGCCAAGGCC 400 

GGAGAGCTGG CCCGCGCGCA GGAGGCCCTG AGCCACACAG AGCAGAGCAA 450 

GTCGGAGCTG AGCTCACGGC TGGACACACT GAGTGCGGAG AAGGATGCTC 500 

TGAGTGGAGC TGTGCGGCAG CGGGAGGCAG ACCTGCTGGC GGCGCAGAGC 550 

CTGGTGCGCG AGACAGAGGC GGCGCTGAGC CGGGAGCAGC AGCGCAGCTC 600 

CCAGGAGCAG GGCGAGTTGC AGGGCCGGCT GGCAGAGAGG GAGTCTCAGG 650 

AGCAGGGGCT GCGGCAGAGG CTGCTGGACG AGCAGTTCGC AGTGTTGCGG 700 

GGCGCTGCTG CCGAGGCCGC GGGCATCCTG CAGGATGCCG TGAGCAAGCT 750 

GGACGACCCC CTGCACCTGC GCTGTACCAG CTCCCCAGAC TACCTGGTGA 800 

GCAGGGCCCA GGAGGCCTTG GATGCCGTGA GCACCCTGGA GGAGGGCCAC 850 

GCCCAGTACC TGACCTCCTT GGCAGACGCC TCCGCCCTGG TGGCAGCTCT 900 

GACCCGCTTC TCCCACCTGG CTGCGGATAC CATCATCAAT GGCGGTGCCA 950 

CCTCGCACCT GGCTCCCACC GACCCTGCCG ACCGCCTCAT AGACACCTGC 1000 

AGGGAGTGCG GGGCCCGGGC TCTGGAGCTC ATGGGGCAGC TGCAGGACCA 1050 

GCAGGCTCTG CGGCACATGC AGGCCAGCCT GGTGCGGACA CCCCTGCAGG 1100 

GCATCCTTCA GCTGGGCCAA GAACTGAAAC CCAAGAGCCT AGATGTGCGG 1150 

CAGGAGGAGC TGGGGGCCGT GGTCGACAAG GAGATGGCGG CCACATCCGC 1200 

AGCCATTGAA GATGCTGTGC GGAGGATTGA GGACATGATG AACCAGGCAC 1250 

GCCACGCCAG CTCGGGGGTG AAGCTGGAGG TGAACGAGAG GATCCTCAAC 1300 

TCCTGCACAG ACCTGATGAA GGCTATCCGG CTCCTGGTGA CGACATCCAC 1350 

TAGCCTGCAG AAGGAGATCG TGGAGAGCGG CAGGGGGGCA GCCACGCAGC 1400 

AGGAATTTTA CGCCAAGAAC TCGCGCTGGA CCGAAGGCCT CATCTCGGCC 1450 

TCCAAGGCTG TGGGCTGGGG AGCCACACAG CTGGTGGAGG CAGCTGACAA 1500 

GGTGGTGCTT CACACGGGCA AGTATGAGGA GCTCATCGTC TGCTCCCACG 1550 

AGATCGCAGC CAGCACGGCC CAGCTGGTGG CGGCCTCCAA GGTGAAGGCC 1600

AACAAGCACA GCCCCCACCT GAGCCGCCTG CAGGAATGTT CTCGCACAGT 1650

CAATGAGAGG GCTGCCAATG TGGTGGCCTC CACCAAGTCA GGCCAGGAGC 1700 

AGATTGAGGA CAGAGACACC ATGGATTTCT CCGGCCTGTC CCTCATCAAG 1750 

CTGAAGAAGC AGGAGATGGA GACGCAGGTG CGTGTCCTGG AGCTGGAGAA 1800 

GACGCTGGAG GCTGAACGCA TGCGGCTGGG GGAGTTGCGG AAGCAACACT 1850 

ACGTGCTGGC TGGGGCATCA GGCAGCCCTG GAGAGGAGGT GGCCATCCGG 1900 

CCCAGCACTG CCCCCCGAAG TGTAACCACC AAGAAACCAC CCCTGGCCCA 1950 

GAAGCCCAGC GTGGCCCCCA GACAGGACCA CCAGCTTGAC AAAAAGGATG 2000 

GCATCTACCC AGCTCAACTC GTGAACTACT AGGCCCCCCA GGGGTCCAGC 2050 

AGGGTGGCTG GTGACAGGCC TGGGCCTCTG CAACTGCCCT GACAGGACCG 2100 

AGAGGCCTTG CCCCTCCACC TGGTGCCCAA GCCTCCCGCC CCACCGTCTG 2150 

GATCAATGTC CTCAAGGCCC CTGGCCCTTA CTGAGCCTGC AGGGTCCTGG 2200 

GCCATGTGGG TGGTGCTTCT GGATGTGAGT CTCTTATTTA TCTGCAGAAG 2250 

GAACTTTGGG GTGCAGCCAG GACCCGGTAG GCCTGAGCCT CAACTCTTCA 2300

GAAAATAGTG TTTTTAATAT TCCTCTTCAG AAAATAGTGT TTTTAATATT 2350 

CCGAGCTAGA GCTCTTCTTC CTACGTTTGT AGTCAGCACA CTGGGAAACC 2400 

GGGCCAGCGT GGGGCTCCCT GCCTTCTGGA CTCCTGAAGG TCGTGGATGG 2450 

ATGGAAGGCA CACAGCCCGT GCCGGCTGAT GGGACGAGGG TCAGGCATCC 2500 

TGTCTGTGGC CTTCTGGGGC ACCGATTCTA CCAGGCCCTC CAGCTGCGTG 2550 

GTCTCCGCAG ACCAGGCTCT GTGTGGGCTA GAGGAATGTC GCCCATTACC 2600 

TCCTCAGGCC CTGGCCCTCG GGCCTCCGTG ATGGGAGCCC CCCAGGAGGG 2700 

GTCAGATGCT GGAAGGGGCC GCTTTCTGGG GAGTGAGGTG AGACATAGCG 2750 

GCCCAGGCGC TGCCTTCACT CCTGGAGTTT CCATTTCCAG CTGGAATCTG 2800 

CAGCCACCCC CATTTCCTGT TTTCCATTCC CCCGTTCTGG CCGCGCCCCA 2850 

CTGCCCACCT GAAGGGGTGG TTTCCAGCCC TCCGGAGAGT GGGCTTGGCC 2900 

CTAGGCCCTC CAGCTCAGCC AGAAAAAGCC CAGAAACCCA GGTGCTGGAC 2950 

CAGGGCCCTC AGGGAGGGAC CCTGCGGCTA GAGTGGGCTA GGCCCTGGCT 3000

TTGCCCGTCA GATTTGAACG AATGTGTGTC CCTTGAGCCC AAGGAGAGCG 3050 

GCAGGAGGGG TGGGACCAGG CTGGGAGGAC AGAGCCAGCA GCTGCCATGC 3100 

CCTCCTGCTC CCCCCACCCC AGCCCTAGCC CTTTAGCCTT TCACCCTGTG 3150 

CTCTGGAAAG GCTACCAAAT ACTGGCCAAG GTCAGGAGGA GCAAAAATGA 3200 

GCCAGCACCA GCGCCTTGGC TTTGTGTTAG CATTTCCTCC TGAAGTGTTC 3250 

TGTTGGCAAT AAAATGCACT TTGACTGTTA AAAAAAAAAA AAAAAAAAAA 3300 

A 3301 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Gly Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala Leu Val
5 10 15

Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Arg Ala Ala
20 25 30

Gln Leu Glu Arg Glu Arg Ser Gln Gly Leu Arg Glu Glu Ala Glu
35 40 45

Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Asn Lys Leu Lys Glu
50 55 60

Lys His Ser Glu Leu Val His Val His Ala Glu Leu Leu Arg Lys
65 70 75

Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln Ser Gln
80 85 90

Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln Val Glu
95 100 105

Gln Val Lys Arg Glu Ser Glu Leu Lys Leu Glu Glu Lys Ser Asp
110 115 120

Gln Gln Glu Lys Leu Lys Arg Glu Leu Glu Ala Lys Ala Gly Glu
125 130 135

Leu Ala Arg Ala Gln Glu Ala Leu Ser His Thr Glu Gln Ser Lys
140 145 150

Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Ser Ala Glu Lys Asp
155 160 165

Ala Leu Ser Gly Ala Val Arg Gln Arg Glu Ala Asp Leu Leu Ala
170 175 180

Ala Gln Ser Leu Val Arg Glu Thr Glu Ala Ala Leu Ser Arg Glu
185 190 195

Gln Gln Arg Ser Ser Gln Glu Gln Gly Glu Leu Gln Gly Arg Leu
200 205 210

Ala Glu Arg Glu Ser Gln Glu Gln Gly Leu Arg Gln Arg Leu Leu
215 220 225

Asp Glu Gln Phe Ala Val Leu Arg Gly Ala Ala Ala Glu Ala Ala
230 235 240

Gly Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro Leu His
245 250 255

Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg Ala Gln
260 265 270

Glu Ala Leu Asp Ala Val Ser Thr Leu Glu Glu Gly His Ala Gln
275 280 285

Tyr Leu Thr Ser Leu Ala Asp Ala Ser Ala Leu Val Ala Ala Leu
290 295 300

Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Ile Asn Gly Gly
305 310 315

Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg Leu Ile
320 325 330

Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu Met Gly
335 340 345

Gln Leu Gln Asp Gln Gln Ala Leu Arg His Met Gln Ala Ser Leu
350 355 360

Val Arg Thr Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln Glu Leu
365 370 375

Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly Ala Val
380 385 390

Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Asp Ala
395 400 405

Val Arg Arg Ile Glu Asp Met Met Asn Gln Ala Arg His Ala Ser
410 415 420

Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn Ser Cys
425 430 435

Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Thr Thr Ser Thr
440 445 450

Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala Ala Thr
455 460 465

Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu
470 475 480

Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln Leu Val
485 490 495

Glu Ala Ala Asp Lys Val Val Leu His Thr Gly Lys Tyr Glu Glu
500 505 510

Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu
515 520 525

Val Ala Ala Ser Lys Val Lys Ala Asn Lys His Ser Pro His Leu
530 535 540

Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg Ala Ala
545 550 555

Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile Glu Asp
560 565 570

Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys Leu Lys
575 580 585

Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu Glu Lys
590 595 600

Thr Leu Glu Ala Glu Arg Met Arg Leu Gly Glu Leu Arg Lys Gln
605 610 615

His Tyr Val Leu Ala Gly Ala Ser Gly Ser Pro Gly Glu Glu Val
620 625 630

Ala Ile Arg Pro Ser Thr Ala Pro Arg Ser Val Thr Thr Lys Lys
635 640 645

Pro Pro Leu Ala Gln Lys Pro Ser Val Ala Pro Arg Gln Asp His
650 655 660

Gln Leu Asp Lys Lys Asp Gly Ile Tyr Pro Ala Gln Leu Val Asn
665 670 675

Tyr



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2338

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCACGAGGG CTCATTCAGA TCCCCCAGCT GCCCGAGAAT CCACCCAACTT 50 
CCTACGAGCC TCGGCCCTGT CAGAGCACAT CAGTCCTGTG GTGGTGATCCC 100 
GGCAGAGGTG TCATCCCCAG ACAGTGAGCC TGTCCTGGAG AAGGATGACCT 150 
CATGGACATG GACGCCTCCC AGCAGACTTT GTTTGACAAC AAGTTTGATGA 200
CGTCTTTGGC AGCTCATTGA GCAGCGACCC TTTCAATTTC [missing] 250
TGGCGTGAAC AAGGACGAGA AGGACCACTT GATTGAACGC [missing] 300
GATCAGTGGA CTGACAGGGC AGCTGGACAA CATGAAGATT [missing] 350
GGCCATGCTG CAGCTGAAGG GTCGAGTGAG TGAGCTGGAG [missing] 400
AGAGCAGCAG CACTTGGGCC GGCAGGCTAT GGATGACTGC GAGTTCCTGCG 450
CACTGAGCTG GATGAACTGA AGAGGCAGCG AGAGGACACG GAGAAGGCACA 500
GCGCAGCCTG ACTGAGATAG AAAGAAAGGC CCAGGCTAAT GAACAGAGGTA 550
TAGCAAGTTA AAAGAGAAGT ACAGTGAACT GGTGCAGAAC CATGCTGACCT 600
GCTGCGGAAG AACGCAGAGG TGACCAAACA GGTGTCCGTG GCCCGGCAAGC 650
CCAGGTGGAT TTGGAAAGAG AGAAAAAAGA GCTAGCAGAT [illegible] 700
GTGTAAGTGA CCAGGCCCAG CGGAAGACTC AAGAGCAACA [illegible] 750
GAGAACCTGA AGCATGAACT GGCCACCAGC AGACAGGAGC [illegible] 800
CCACAGCAAC CTGGAAACCT CTGCCCAGTC AGAAGCGAAA [illegible] 850
AGATCGCCGA GTTGGAGAAG GAACAAGGCA GCTTGGCGAC TGTTGCAGCT 900
CAGAGAGAGG AAGAGTTATC AGCCCTCCGA GACCAGCTGG AAAGCACCCA 950
GATCAAGCTG GCTGGGGCCC AGGAATCCAT GTGCCAGCAG GTGAAGGACC 1000
AGAGGAAAAC CCTCTTGGCA GGGATCAGGA AGGCTGCGGA GCGTGAGATA 1050
CAGGAGGCGC TGAGCCAGCT TGAGGT^CCC ACCCTCATCA GCTGTGCAGG 1100
ATCCACAGAT CACCTTCTCT CCAAAGTCAG CTCCGTTTCC AGCTGCCTCG 1150
AGCAACTGGA AAAGAACGGC AGCCAGTATC TGGCCTGCCC AGAAGATATT 1200
AGTGAGCTTC TGCACTCGAT CACCCTGCTT GCCCACTTGA CCGGTGACAC 1250
TGTCATCCAG GGGAGTGCCA CCAGCCTCCG GGCCCCACCG GAGCCAGCCG 1300
ACTCGTTGAC GGAGGCCTGT AGGCAGTATG GCAGAGAAAC CCTGGCCTAT 1350
CTGTCCTCCC TGGAGGAAGA GGGAACTGTG GAGAATGCTG ACGTCACAGC 1400
CCTTAGGAAT TGCCTCAGCA GGGTCAAGAC CCTTGGCGAG GAGCTGCTGC 1450
CCAGGGGCCT GGACATCAAG CAGGAAGAGC TGGGTGACCT [illegible] 1500
GAGATGGCAG CCACTTCAGC TGCCATTGAA GCTGCCACCA CCCGGATAGA 1550
GGAAATTCTC AGTAAGTCCC GAGCAGGAGA CACGGGAGTC AAGCTGGAGG 1600
TGAATGAGAG GATCCTGGGT TCCTGTACCA GCCTGATGCA GGCCATCAAG 1650
GTGCTCGTTG TGGCCTCCAA GGACCTCCAG AAGGAGATAG TGGAGAGTGG 1700
CAGGGGTAGT GCATCCCCTA AAGAATTTTA CGCCAAGAAC TCTCGGTGGA 1750
CGGAAGGGCT GATATCCGCC TCCAAAGCTG TTGGTTGGGG AGCTACCATC 1800
ATGGTGGATG CTGCTGATCT TGTGGTCCAA GGCAAAGGGA AGTTCGAGGA 1850
GCTGATGGTG TGTTCACGCG AGATTGCTGC CAGTACTGCC CAGCTCGTGG 1900
CTGCATCCAA GGTGAAAGCG AACAAGGGCA GCCTCAATCT GACCCAGCTG 2000
CAGCAGGCCT CTCGAGGAGT GAACCAGGCC ACAGCCGCTG TGGTGGCCTC 2050
AACCATTTCT GGCAAATCTC AGATTGAGGA AACAGACAGT ATGGACTTCT 2100
CAAGCATGAC ACTGACCCAG ATCAAGCGCC AGGAGATGGA TTCCCAGGTT 2150
AGGGTGCTGG AGCTGGAAAA TGACCTGCAG AAGGAGCGTC AGAAACTAGG 2200
AGAGCTACGG AAGAAACACT ACGAGCTGGA GGGCGTGGCT GAGGGCTGGG 2250
AGGAAGGGAC AGAAGCATCA CCGTCTACTG TCCAAGAAGC AATACCGGAC 2300
AAAGAGTAGA GCCAAGCCGA CACCCCACAC ATCAGAAA 2338



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: mouse
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Ala Arg Gly Leu Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro Pro
5 10 15

Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro Val
20 25 30

Val Val Ile Pro Ala Glu Val Ser Ser Pro Asp Ser Glu Pro Val
35 40 45

Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln Thr
50 55 60

Leu Phe Asp Asn Lys Phe Asp Asp Val Phe Gly Ser Ser Leu Ser
65 70 75

Ser Asp Pro Phe Asn Phe Asn Asn Gln Asn Gly Val Asn Lys Asp
80 85 90

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly
95 100 105

Leu Thr Gly Gln Leu Asp Asn Met Lys Ile Glu Ser Gln Arg Ala
110 115 120

Met Leu Gln Leu Lys Gly Arg Val Ser Glu Leu Glu Ala Glu Leu
125 130 135

Ala Glu Gln Gln His Leu Gly Arg Gln Ala Met Asp Asp Cys Glu
140 145 150

Phe Leu Arg Thr Glu Leu Asp Glu Leu Lys Arg Gln Arg Glu Asp
155 160 165

Thr Glu Lys Ala Gln Arg Ser Leu Thr Glu Ile Glu Arg Lys Ala
170 175 180

Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser
185 190 195

Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu
200 205 210

Val Thr Lys Gln Val Ser Val Ala Arg Gln Ala Gln Val Asp Leu
215 220 225

Glu Arg Glu Lys Lys Glu Leu Ala Asp Ser Phe Ala Arg Val Ser
230 235 240

Asp Gln Ala Gln Arg Lys Thr Gln Glu Gln Gln Asp Val Leu Glu
245 250 255

Asn Leu Lys His Glu Leu Ala Thr Ser Arg Gln Glu Leu Gln Val
260 265 270

Leu His Ser Asn Leu Glu Thr Ser Ala Gln Ser Glu Ala Lys Trp
275 280 285

Leu Thr Gln Ile Ala Glu Leu Glu Lys Glu Gln Gly Ser Leu Ala
290 295 300

Thr Val Ala Ala Gln Arg Glu Glu Glu Leu Ser Ala Leu Arg Asp
305 310 315

Gln Leu Glu Ser Thr Gln Ile Lys Leu Ala Gly Ala Gln Glu Ser
320 325 330

Met Cys Gln Gln Val Lys Asp Gln Arg Lys Thr Leu Leu Ala Gly
335 340 345

Ile Arg Lys Ala Ala Glu Arg Glu Ile Gln Glu Ala Leu Ser Gln
350 355 360

Leu Glu Glu Pro Thr Leu Ile Ser Cys Ala Gly Ser Thr Asp His
365 370 375

Leu Leu Ser Lys Val Ser Ser Val Ser Ser Cys Leu Glu Gln Leu
380 385 390

Glu Lys Asn Gly Ser Gln Tyr Leu Ala Cys Pro Glu Asp Ile Ser
395 400 405

Glu Leu Leu His Ser Ile Thr Leu Leu Ala His Leu Thr Gly Asp
410 415 420

Thr Val Ile Gln Gly Ser Ala Thr Ser Leu Arg Ala Pro Pro Glu
425 430 435

Pro Ala Asp Ser Leu Thr Glu Ala Cys Arg Gln Tyr Gly Arg Glu
440 445 450

Thr Leu Ala Tyr Leu Ser Ser Leu Glu Glu Glu Gly Thr Val Glu
455 460 465

Asn Ala Asp Val Thr Ala Leu Arg Asn Cys Leu Ser Arg Val Lys
470 475 480






wo 99/60986 PCT/US99/11743 

Thr Leu Gly Glu Glu Leu Leu Pro Arg Gly Leu Asp Ile Lys Gln
485 490 495

Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met Ala Ala Thr Ser
500 505 510

Ala Ala Ile Glu Ala Ala Thr Thr Arg Ile Glu Glu Ile Leu Ser
515 520 525

Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu Glu Val Asn Glu
530 535 540

Arg Ile Leu Gly Ser Cys Thr Ser Leu Met Gln Ala Ile Lys Val
545 550 555

Leu Val Val Ala Ser Lys Asp Leu Gln Lys Glu Ile Val Glu Ser
560 565 570

Gly Arg Gly Ser Ala Ser Pro Lys Glu Phe Tyr Ala Lys Asn Ser
575 580 585

Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp
590 595 600

Gly Ala Thr Ile Met Val Asp Ala Ala Asp Leu Val Val Gln Gly
605 610 615

Lys Gly Lys Phe Glu Glu Leu Met Val Cys Ser Arg Glu Ile Ala
620 625 630

Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn
635 640 645

Lys Gly Ser Leu Asn Leu Thr Gln Leu Gln Gln Ala Ser Arg Gly
650 655 660

Val Asn Gln Ala Thr Ala Ala Val Val Ala Ser Thr Ile Ser Gly
665 670 675

Lys Ser Gln Ile Glu Glu Thr Asp Ser Met Asp Phe Ser Ser Met
680 685 690

Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp Ser Gln Val Arg
695 700 705

Val Leu Glu Leu Glu Asn Asp Leu Gln Lys Glu Arg Gln Lys Leu
710 715 720

Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Glu Gly Val Ala Glu
725 730 735

Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Ser Thr Val Gln Glu
740 745 750

Ala Ile Pro Asp Lys Glu
755



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3964 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGCACGAGGC GGCGCGCGGC CTCCGTGTGC CTAGGCTTGA GGCGGGCGGT 50 

GACGCCTCAT TCGCGCGGAG CCGGGCCGGG ACACGGTCGG CGGCAGCATG 100 

AACAGCATCA AGAATGTGCC GGCGCGGGTG CTGAGCCGCA GGCCGGGCCA 150 

CAGCCTAGAG GCCGAGCGCG AGCAGTTCGA CAAGACGCAG GCCATCAGTA 200 

TCAGCAAAGC CATCAACAGC CAGGAGGCCC CAGTGAAGGA GAAGCATGCC 250

CGGCGTATCA TCCTGGGCAC GCATCATGAG AAGGGAGCCT TCACCTTCTG 300 

GTCCTATGCC ATCGGCCTGC CGCTGTCCAG CAGCTCCATC CTCAGCTGGA 350 

AGTTCTGTCA CGTCCTTCAC AAGGTCCTCC GGGACGGACA CCCCAACGTC 400 

CTGCATGACT ATCAGCGGTA CCGGAGCAAC ATACGTGAGA TCGGTGACTT 450 

GTGGGGCCAC CTTCGTGACC AGTATGGACA CCTGGTGAAT ATCTATACCA 500 

AACTGTTGCT GACTAAGATC TCCTTCCACC TTAAGCACCC CCAGTTTCCT 550 

GCAGGCCTGG AGGTAACAGA TGAGGTGTTG GAGAAGGCGG CGGGAACTGA 600 

TGTCAACAAC ATTTTTCAGC TTACCGTGGA GATGTTTGAC TACATGGACT 650 

GTGAACTGAA GCTTTCTGAG TCAGTTTTCC GGCAGCTCAA CACGGCCATC 700 

GCAGTGTCCC AGATGTCTTC TGGCCAGTGT CGCCTAGCGC CGCTCATCCA 750 

GGTCATTCAG GACTGCAGCC ACCTGTACCA CTACACAGTG AAGCTCATGT 800 

TTAAGCTGCA CTCCTGTCTC CCGGCAGACA CCCTGCAAGG CCACAGGGAT 850 

CGGTTCCACG AGCAGTTCGA CAGCCTCAAA AACTTCTTCC GCCGGGCTTC 900 

AGACATGCTG TACTTCAAGA GGCTCATCCA GATCCCGCGG CTGCCTGAGG 950 

GACCCCCCAA TTTCCTGCGG GCTTCAGCCC TGGCTGAGCA CATCAAGCCG 1000 

GTGGTGGTGA TTCCCGAGGA GGCCCCAGAG GAAGAGGAGC CTGAGAACCT 1050 

AATTGAAATC AGCAGTGCGC CCCCTGCTGG GGAGCCAGTG GTGGTGGCTG 1100 

ACCTCTTTGA TCAGACCTTT GGACCCCCCA ATGGCTCCAT GAAGGATGAC 1150 

AGGGACCTCC AAATCGAGAA CTTGAAGAGA GAGGTGGAGA CCCTCCGTGC 1200 

TGAGCTGGAG AAGATTAAGA TGGAGGCACA GCGGTACATC TCCCAGCTGA 1250 

AGGGCCAGGT GAATGGCCTG GAGGCAGAGC TGGAGGAGCA GCGCAAGCAG 1300 

AAGCAGAAGG CCCTGGTGGA CAACGAGCAG CTGCGCCACG AGCTGGCCCA 1350 

GCTCAAGGCC CTGCAGCTGG AGGGCGCCCG CAACCAGGGC CTTCGAGAGG 1400 

AAGCAGAGAG GAAGGCCAGT GCCACGGAGG CACGCTACAG CAAGCTGAAG 1450 

GAGAAACACA GCGAACTCAT TAACACGCAC GCCGAGCTGC TCAGGAAGAA 1500
CGCAGACACG GCCAAGCAGC TGACAGTGAC ACAGCAGAGC CAGGAGGAGG 1550
TGGCACGGGT AAAGGAACAG CTGGCCTTCC AGATGGAGCA AGCGAAGCGT 1600
GAGTCTGAGA TGAAGATGGA AGAGCAGAGC GACCAGTTGG AGAAGCTCAA 1650
GAGGGAGCTG GCGGCCAGGG CAGGAGAGCT GGCCCGTGCG CAGGAGGCCC 1700
TGAGCCGCAC AGAACAGAGT GGGTCAGAGC TGAGCTCACG GCTGGACACA 1750 
CTGAACGCGG AGAAGGAAGC CCTGAGTGGA GTCGTTCGGC AGCGTGAGGC 1800 
AGAGCTGCTG GCCGCTCAGA GCCTGGTGCG GGAGAAGGAG GAGGCGCTTA 1850 
GCCAAGAGCA GCAGCGGAGC TCCCAGGAGA AGGGCGAGCT ACGGGGGCAG 1900 
CTGGCAGAAA AGGAGTCTCA GGAGCAGGGG CTTCGGCAGA AGCTGCTGGA 1950 
TGAGCAGTTG GCGGTGTTGC GAAGTGCAGC CGCCGAGGCA GAGGCCATCC 2000 
TACAGGATGC AGTGAGCAAG CTGGACGACC CCCTGCACCT CCGCTGCACC 2050 
AGCTCCCCAG ACTACTTGGT GAGCCGGGCT CAGGCAGCCC TGGACAGCGT 2100 
GAGCGGCCTG GAGCAGGGCC ACACCCAGTA CCTGGCTTCC TCCGAAGATG 2150 
CTTCTGCCCT GGTGGCAGCG CTGACCCGCT TCTCCCATTT GGCTGCGGAC 2200 
ACCATTGTCA ATGGTGCCGC CACCTCCCAC CTGGCCCCCA CCGACCCCGC 2250 
CGACCGCCTG ATGGACACAT GCAGGGAGTG TGGAGCCCGG GCTCTGGAGC 2300
TGGTGGGACA GCTGCAAGAC CAGACAGTGC TACGGAGGGC TCAGCCCAGC 2350 
CTGATGCGGG CCCCCCTGCA GGGCATTCTG CAGTTGGGCC AGGACTTGAA 2400 
GCCTAAGAGC CTGGATGTAC GGCAAGAGGA GCTAGGGGCC ATGGTGGACA 2450 
AGGAGATGGC GGCCACCTCG GCAGCCATTG AGGACGCTGT GCGGAGGATC 2500
GAGGACATGA TGAGCCAGGC CCGCCACGAG AGCTCAGGCG TGAAACTGGA 2550 
GGTGAATGAG AGGATCCTCA ACTCCTGCAC AGACCTGATG AAGGCTATCC 2600 
GGCTCCTGGT GATGACCTCC ACCAGCCTGC AGAAGGAAAT TGTGGAGAGC 2650 
GGCAGGGGGG CAGCAACGCA GCAGGAATTT TATGCCAAGA ATTCACGGTG 2700 
GACTGAAGGC CTCATCTCAG CCTCTAAGGC AGTGGGCTGG GGAGCCACAC 2750 
AGCTGGTGGA GTCAGCTGAC AAGGTTGTGC TTCACATGGG CAAATACGAG 2800 
GAACTCATCG TCTGCTCCCA TGAGATTGCG GCCAGCACGG CCCAGCTGGT 2850 
GGCAGCCTCG AAGGTGAAAG CCAACAAGAA CAGTCCCCAC TTGAGCCGCC 2900 
TGCAGGAATG TTCCCGCACT GTCAACGAGA GGGCTGCCAA CGTCGTGGCC 2950 
TCCACCAAAT CTGGCCAGGA GCAGATTGAG GACAGAGACA CCATGGATTT 3000 
CTCTGGCCTG TCCCTCATCA AGTTGAAGAA GCAGGAGATG GAGACACAGG 3050 
TGCGAGTCTT GGAGCTGGAG AAGACACTAG AGGCAGAGCG TGTCCGGCTC 3100 
GGGGAGCTTC GGAAACAGCA CTATGTACTG GCTGGGGGGA TGGGAACACC 3150 
TAGCGAAGAA GAACCCAGCA GACCCAGCCC AGCTCCCCGA AGTGGGGCCA 3200 
CTAAGAAGCC ACCGCTGGCC CAGAAACCCA GCATAGCCCC CAGGACAGAC 3250 
AACCAGCTCGA CAAAAAGGAT GGTGTCTACC CAGCTCAACT TGTGAACTAC 3300 
TAGGCCCCTAA GGTGTTCAGC AGGATGGCTG GTGGTTGTGC CTGGGCTTCA 3350 
TGTGGCTGTCT GGCAGTGGTC AAGGGGCCTC TGAGAAGCCT CCAACTCCTG 3400 
CCCAAGGGGCC TAGTCTGTGG GACAGTTCAT CTGGATGTGA ATCTATTTAT 3450 
CTTAAGTAGGA ACTGCCTCGA GCAGCTGGGA CCCAGCAGGC CTGAGCCACA 3500 
AATCTGCAGCG GACATCAGAG ATAGTCTGAA TGCTGCGAGG TATTTCTTTC 3550 
TTCGTAAGTTT AGTCAGCACA CTGGGAAAAG GTCACATAAG CCAGGAGCCT 3600 
CCTTGTCTCTG GACTCAAAAG TCTGAGGCCT TAAGTGAACA ACAGAAAGAG 3650 
GGTCCCTGCTG GCTACCAGGG ATAAGGGGAT GACCTGTGAC CCTTGAGCCA 3700 
GGGAGAGCAGG TAAGCTGGGT GGTGTCATCA CCTGGGGGCC TGGTGCTAGG 3750 
GCATCCATGCT GGGAGCCCCA GGAGACCAGG CTTTGTGTGG GAGCCTGGCA 3800 
TCATCGTGGCT GGGGCAGCCC CTGCTCAGGT GCTGTCTCTG CCCGTGACCT 3850
TGAAGCCACCC TCCCCCCGTA CAGTTTTCCA TTCTCCTGGC TACTAGTGTG 3900 
GCTGTTCATTG CCTACCTTGA TGAGTAGATT TCAGCCCTCC TAAAGCTGGG 3950 
GCCTTTCCTCG TGCC 3964

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: mouse 

(ix) FEATURE: Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Asn Ser Ile Lys Asn Val Pro Ala Arg Val Leu Ser Arg Arg
5 10 15

Pro Gly His Ser Leu Glu Ala Glu Arg Glu Gln Phe Asp Lys Thr
20 25 30

Gln Ala Ile Ser Ile Ser Lys Ala Ile Asn Ser Gln Glu Ala Pro
35 40 45

Val Lys Glu Lys His Ala Arg Arg Ile Ile Leu Gly Thr His His
50 55 60

Glu Lys Gly Ala Phe Thr Phe Trp Ser Tyr Ala Ile Gly Leu Pro
65 70 75

Leu Ser Ser Ser Ser Ile Leu Ser Trp Lys Phe Cys His Val Leu
80 85 90

His Lys Val Leu Arg Asp Gly His Pro Asn Val Leu His Asp Tyr
95 100 105

Gln Arg Tyr Arg Ser Asn Ile Arg Glu Ile Gly Asp Leu Trp Gly
110 115 120

His Leu Arg Asp Gln Tyr Gly His Leu Val Asn Ile Tyr Thr Lys
125 130 135

Leu Leu Leu Thr Lys Ile Ser Phe His Leu Lys His Pro Gln Phe
140 145 150

Pro Ala Gly Leu Glu Val Thr Asp Glu Val Leu Glu Lys Ala Ala
155 160 165

Gly Thr Asp Val Asn Asn Ile Phe Gln Leu Thr Val Glu Met Phe
170 175 180

Asp Tyr Met Asp Cys Glu Leu Lys Leu Ser Glu Ser Val Phe Arg 




185 190 195 

Gln Leu Asn Thr Ala Ile Ala Val Ser Gln Met Ser Ser Gly Gln
200 205 210

Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Gln Asp Cys Ser His
215 220 225

Leu Tyr His Tyr Thr Val Lys Leu Met Phe Lys Leu His Ser Cys
230 235 240

Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe His Glu
245 250 255

Gln Phe His Ser Leu Lys Asn Phe Phe Arg Arg Ala Ser Asp Met
260 265 270

Leu Tyr Phe Lys Arg Leu Ile Gln Ile Pro Arg Leu Pro Glu Gly
275 280 285

Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ala Glu His Ile Lys
290 295 300

Pro Val Val Val Ile Pro Glu Glu Ala Pro Glu Glu Glu Glu Pro
305 310 315

Glu Asn Leu Ile Glu Ile Ser Ser Ala Pro Pro Ala Gly Glu Pro
320 325 330

Val Val Val Ala Asp Leu Phe Asp Gln Thr Phe Gly Pro Pro Asn
335 340 345

Gly Ser Met Lys Asp Asp Arg Asp Leu Gln Ile Glu Asn Leu Lys
350 355 360

Arg Glu Val Glu Thr Leu Arg Ala Glu Leu Glu Lys Ile Lys Met
365 370 375

Glu Ala Gln Arg Tyr Ile Ser Gln Leu Lys Gly Gln Val Asn Gly
380 385 390

Leu Glu Ala Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala
395 400 405

Leu Val Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Lys
410 415 420

Ala Leu Gln Leu Glu Gly Ala Arg Asn Gln Gly Leu Arg Glu Glu
425 430 435

Ala Glu Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Ser Lys Leu
440 445 450

Lys Glu Lys His Ser Glu Leu Ile Asn Thr His Ala Glu Leu Leu
455 460 465

Arg Lys Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln
470 475 480

Ser Gln Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln
485 490 495

Met Glu Gln Ala Lys Arg Glu Ser Glu Met Lys Met Glu Glu Gln
500 505 510

Ser Asp Gln Leu Glu Lys Leu Lys Arg Glu Leu Ala Ala Arg Ala
515 520 525

Gly Glu Leu Ala Arg Ala Gln Glu Ala Leu Ser Arg Thr Glu Gln
530 535 540

Ser Gly Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Asn Ala Glu
545 550 555

Lys Glu Ala Leu Ser Gly Val Val Arg Gln Arg Glu Ala Glu Leu
560 565 570

Leu Ala Ala Gln Ser Leu Val Arg Glu Lys Glu Glu Ala Leu Ser
575 580 585

Gln Glu Gln Gln Arg Ser Ser Gln Glu Lys Gly Glu Leu Arg Gly
590 595 600

Gln Leu Ala Glu Lys Glu Ser Gln Glu Gln Gly Leu Arg Gln Lys
605 610 615

Leu Leu Asp Glu Gln Leu Ala Val Leu Arg Ser Ala Ala Ala Glu
620 625 630

Ala Glu Ala Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro
635 640 645

Leu His Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg
650 655 660

Ala Gln Ala Ala Leu Asp Ser Val Ser Gly Leu Glu Gln Gly His
665 670 675

Thr Gln Tyr Leu Ala Ser Ser Glu Asp Ala Ser Ala Leu Val Ala
680 685 690

Ala Leu Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Val Asn
695 700 705

Gly Ala Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg
710 715 720

Leu Met Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu
725 730 735

Val Gly Gln Leu Gln Asp Gln Thr Val Leu Arg Arg Ala Gln Pro
740 745 750

Ser Leu Met Arg Ala Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln
755 760 765

Asp Leu Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly
770 775 780

Ala Met Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu
785 790 795

Asp Ala Val Arg Arg Ile Glu Asp Met Met Ser Gln Ala Arg His
800 805 810

Glu Ser Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn
815 820 825

Ser Cys Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Met Thr
830 835 840

Ser Thr Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala
845 850 855

Ala Thr Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu
860 865 870

Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln
875 880 885

Leu Val Glu Ser Ala Asp Lys Val Val Leu His Met Gly Lys Tyr
890 895 900

Glu Glu Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala
905 910 915

Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn Lys Asn Ser Pro
920 925 930

His Leu Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg
935 940 945

Ala Ala Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile
950 955 960

Glu Asp Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys
965 970 975

Leu Lys Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu
980 985 990

Glu Lys Thr Leu Glu Ala Glu Arg Val Arg Leu Gly Glu Leu Arg
995 1000 1005

Lys Gln His Tyr Val Leu Ala Gly Gly Met Gly Thr Pro Ser Glu
1010 1015 1020

Glu Glu Pro Ser Arg Pro Ser Pro Ala Pro Arg Ser Gly Ala Thr
1025 1030 1035

Lys Lys Pro Pro Leu Ala Gln Lys Pro Ser Ile Ala Pro Arg Thr
1040 1045 1050

Asp Asn Gln Leu Asp Lys Lys Asp Gly Val Tyr Pro Ala Gln Leu
1055 1060 1065

Val Asn Tyr



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GAAGATACCC CACCAAAC 18

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GCTTGACAGT GTAGTCATAA AGGTGGCTGC AGTCC 35 

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACATGTCC AGGGAGTTGA ATAC 24 



(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CUACUACUAC UACUAGGCCA CGCGTCGACT AGTACGGGD GGGUGGOn G 41 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 






(x) FEATURE: exon 1 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCTGTGGAAG GTTTGGAGGG GAGAGAGGGG CAGCTGGATG CTCTTGGGCC ACGGTCGCCC 60 

CTGATCTCTG CGCCTCTTCC TCCTGCTCCG GGAGAAATAA TGTTTCCCTG GGGGATGAAA 120 

GCATCTCTTT GTGCGGGCTT TAATTGCCAT GTTGTTGTGC CAAGGGAGTG AGTGGCGGCG 180 

GGACCAGCAG CTGGGCACAG CCAATGCCAG GCAGTGGTGC CCACTCCCTC AGGACGCCCA 240 

GCCAGCTGGC TCCTGGGAGC GCTGCCCACC TCTGCCCCCA GCTGGGCGCC TGCAAGGAAC 300 

CGACCACCCG TGGGGCTGGG GGAGGTTGGC TGGAGGAGGA GAAAGGGGCG GGCTCTGGGA 360 

GGGTCTCAGC CACTCTCAGA GGCTTATTCA TCTCATCCTC CTTTCCCTCC CCCTTCTTGT 420 

TTTTCAGACT GTCAGCATCA ATAAGGCCAT TAATACGCAG GAAGTGGCTG TAAAGGAAAA 480 

ACACGCCAGA AATATCCTTT GGATGTTGCT TGGAAG 516 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 2 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TGTTTTCCAT AACCCCCCCT CACCGTGCAT ACTGGGCACC CACCATGAGA AAGGGGCACA 60 
GACCTTCTGG TCTGTTGTCA ACCGCCTGCC TCTGTCTAGC AACCCAGTGC TCTGCTGGAA 120 
GTTCTGCCAT GTGTTCCACA AACTCCTCCG AGATGGACAC CCGAACGTGA GTTCCTGGGG 180 
CTATGGGGTG GCA 193 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 3 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GTGTTCTTTT GCCCCTGCAG GTCCTGAAGG ACTCTCTGAG ATACAGAAAT GAATTGAGTG 60 
ACATGAGCAG GATGTGGGTG AGTTTGGAGA TGTACTCAGG AGCC 104 

(2) INFORMATION FOR SEQ ID NO:20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 4 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AATTCCTGGC TGCAGATCTC TTGACTGTTA TGTTCTTGTT GTTGACTCTG TTTCCCCTCC 60 
TCTTCCTAAA AGGGCCACCT GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA 120 
CTGCTAAGAA CCAAGATGGA GTACCACACC AAAGTGAGTC TCTGCGGACA GTTCTGCCGC 180 
CACCGCCGCC TCCCCTGCTC CATCCCTTCA GCCCCTCCCT GGGCTCATTT GTCAGCTCTT 240 
TCAGGTAATA GACAGCCCAG GCTTCTGAGG AAGTGTGCAC ATCATGTACC CAAGCTGTGA 300 
GAGAGGAAAG CCACCGCCAG GCCCACG 327 



(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 5 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GGGCTCAAGC AATCCTCCCA CCTCGGCCTC CCAAGTAGCT GGGACCACAG GCGTGTGCCA 60 
CCACGCCCGG CTGAGAGAGG GCTCTTCATG TCTTCTGCCC TGACTCCCTT CCTCTGCCTC 120 
CCTTCCAGAA TCCCAGGTTC CCAGGCAACC TGCAGATGAG TGACCGCCAG CTGGACGAGG 180 
CTGGAGAAAG TGACGTGAAC AACTTGTAAG TGGCTCCTGC CCTGAGCCCA GGGAGGGAGA 240 
AAGCTTTTGT GAATGCTGAC ACTTCTCATA AGGGTCATGG AGGGCCTGAT GGGGGGAGGC 300 
CGTGGCTGGG ATGGGGACCA AAGCCCCTGG G 331 



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 6 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACTGTCGCTG TCACTGTTGA CTTCACCAGG CTGCATGGCC ATAATACCCA CAAGGCTAAG 60 

ACTTGGAGCT GGAGTTGTGT GTGTGTTTGC GCATGCACAT GAGCATTGGA GACTGGAGTA 120 

GCGTAGAGCG TGGGGGAGGG GACAGGTAAC AGACCGGCCT CAGGCTGTGG AGTGTAAGCT 180 

CTCTTTCCTC TTGGGTCCAG TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT 240 

GAACTCAACC TCTTCCAAAC AGGTGAGTCT CTTCCCTCCC GTCTAACCCA GGCTCTCATG 300 

GGAACTACCT AATTCCTAGT CCTCCTCTCC CTGCAAAGTG TGCAGCACAA GGGGTAGGAA 360 

AATGGAGACA TTCACACCCC ATCTCTGGTC TCTCCAACCC TCGTGCAGGG AGGGACTGAA 420 

CCTCTTCAGT ATTTTTCTTT TTAAGAGACA AGGTCTCGGC CGGGTGCAGT 470 

(2) INFORMATION FOR SEQ ID NO:23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 565 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 7 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

TCTTCACCTG TTTAATGGGG ATACGTTTAC CTATCTCATG GGAGTGTTGT GAAGGTTAAA 60 

TGAATTAGAT GAGGTAAAGC ACGCACAGAA TCGGTCCTTG GTGTATGTTG GACCCCTGCC 120 

TCTGCCCCTC TGAAGAGGCT GCCTGTAATC CCCTGGCTCT ACCACCTTTC TCCCTCACTT 180 

TTATTTCCTA GTATTCAACT CCCTGGACAT GTCCCGCTCT GTGTCCGTGA CGGCAGCAGG 240 

GCAGTGCCGC CTCGCCCCGC TGATCCAGGT CATCTTGGAC TGCAGCCACC TTTATGACTA 300 

CACTGTCAAG CTTCTCTTCA AACTCCACTC CTGTGAGTAC CGCGGGCCAG ATCTTCTTAC 360 

ATGAGATTCA GGCCAGAGGG AGGATCCCAG CCTGAGGATG TCCCCAGAGA AACGCAGTCC 420 

TTCTCAGTGC CTTTGGCTGT CTGCTTCTGT TCCAAAAGGC CCCGGAGCTT CTGACCATTG 480 

TGAGGATAAA AGAGCAGGGC CCAGGCTTTG GTGACCCCAG TAAAGCCCCT GGCTTGCCAC 540 

TCTTGCGTCC AGTGTTACAG GATCT 565 



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 8 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GGGACAGCTC TAGGCCAGTC GTGGCCCCTG GCAGTGCTGG CCACATGCCC CAGGGTAGCT 60 

GGGCCCCTCC CCCTCGAGAG CCCCGCTGTG GCTTCCCTGC CCTCTGGTCC CCCTCCCCTC 120 

TCACACTCTT TCCAATTTCT TCCAGGCCTC CCAGCTGACA CCCTGCAAGG CCACCGGGAC 180 

CGCTTCATGG AGCAGTTTAC AAAGTAAGTG GTTCAAGTAA CAGGAATGGA GGT 233 

(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 578 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exons 9 and 10 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

TGAATCCCAG CACCATGGAG TTTATCTCCT TGACAGCCTG TGCCTTTGGG CTGGGGAGGG 60 

GGCAGGAAAG CCAGGTGGCT GCTCTGTCCC CTACATGGGG CTGATGAAGA CACCCAGCAC 120 

CCCTCAGGTC CTTCTCCACC CCTAGGTTGA AAGATCTGTT CTACCGCTCC AGCAACCTGC 180 

AGTACTTCAA GCGGCTCATT CAGATCCCCC AGCTGCCTGA GGTAAGCATG CCCAACCACA 240 

CACCCTCGGC ACTGCAGAGG CCCCAGGTAC TCTCTTAAGG GCCGGCGGGG CCTGGCAAGC 300 

AAGCACTATT TGAGGATGTG TCTCCGTCTT CAGAACCCAC CCAACTTCCT GCGAGCCTCA 360 

GCCCTGTCAG AACATATCAG CCCTGTGGTG GTGATCCCTG CAGAGGCCTC ATCCCCCGAC 420 

AGCGAGCCAG TCCTAGAGAA GGATGACCTC ATGGACATGG ATGCCTCTCA GCAGGTGAGG 480 

ACCACTTGGG AGAGAAACTT GGCCTTTCCT CTCACCTGCA AGTACAGGGG AGAGGCTGGG 540 

GGAGACCCTG GCCAAAGCCC ATTGACTCTA ACCAGGTT 578 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 11 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AAAAAAATTT AAAAAATTAA ACAGGTCTGA ACCGTTTAAT TCGAGAAAGG GGGCATTCTC 60 

CCATATCACT CAACTGACCC ACACACAGAA TTCTCTGGCT CTCTGACTTA TTCTCACTCC 120 

TTTTTGGTCA ACCACAGAAT TTATTTGACA ACAAGTTTGA TGACATCTTT GGCAGTTCAT 180 

TCAGCAGTGA TCCCTTCAAT TTCAACAGTC AAAATGGTGT GAACAAGGAT GAGAAGTGAG 240 

TCCAAGCTGG GTTCAAGCAG ATGGTTCAGG AGCTAAGTTA AGCCATGGTC TGCCTCAAAA 300 

CACTAACCAA AGAGGAATTC TTAATGATAC TGGGGCTTCT TAGATACAGA ACATCTTGAA 360 

GGGTTGGGGG CAATGGCTTA TGCCTGTAAT 390 

(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 12 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AAAATCAATA ACCATGGATT TATGAGTATT AGATTAGTAT CTGGTAACAT TTAGAGTATA 60 
ATTTATGGCA TTTCAAAGAA TTGTCCCCAA ATTAATACCA GCTTTTAATT TCCTCCCCTG 120 
AGCTCACAAT TAAAAACAGA GGGATAGAAG CACTATGAAA GCAAACTCAT TCCCCTTCTC 180 
TTCCCAGGGA CCACTTAATT GAGCGACTAT ACAGAGAGAT CAGTGGATTG AAGGCACAGC 240 
TAGAAAACAT GAAGACTGAG GTATAACTTG GATCTGCTCT GCCTTTGCGC TTCACCAAAA 300 
CACGGTAGAT TTGAATGTTA AATTTGCATC ACACTAGCCA GGCACAGTGG CTCACACCTG 360 
TAATCCTAGC ACTTTGGGAG GCCAAGGCAG GAGGATTACC TGAGGTCGGG AGTTCGAGAC 420 
CAGCCTGGGC AACAGGGTGA AACCCCCGTC TTCAATAAAA ATGCAATAAT TAGCCGGGTG 480 
TGTTGGCAGG CACCTGTAAT CCCAGCTACT CGGGAAGCTG AGGCATGAGA ATTGCTTGAA 540 
CTTGGGA 547 



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 436 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 13 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CCCCCAGCCA CTCTAAAGAG GACCACAATT CCCCGGCCAT CATCCCCTGT TATTGTTGTT 60 
GATTGAGGGG CTCCTAATGA CCAGATGGTC CAACCCTCCT GGGACGTGGA GAGTTGACTT 120 
AGGGGAATCA GGTATTTACT TGGAAGCATG GTAGGACCCG CTTCTCCGGC CCATGCCCGT 180 
GACCCGTGGC AGTGGGCGGT TGGCCTCATG ACCGGAGTCC CCCCACAGAG CCAGCGGGTT 240 
GTGCTGCAGC TGAAGGGCCA CGTCAGCGAG CTGGAAGCAG ATCTGGCCGA GCAGCAGCAC 300 
CTGCGGCAGC AGGCGGCCGA CGACTGTGAA TTCCTGCGGG CAGAACTGGA CGAGCTCAGG 360 
AGGCAGCGGG AGGACACCGA GAAGGCTCAG CGGAGCCTGT CTGAGATAGA AAGTGAGCGG 420 
TGGGTGGGGG CGGGGG 436 

(2) INFORMATION FOR SEQ ID NO:29: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 14 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GACTTGAGCC CAAGGAGGTC AAGGCTGCAG TGAACAGTGA TTGTGCCACT GCACCCCAGC 60 
CTGGGTGACA GAGCAAGACT GTCTCAAAAC AAAACAAGGA GGACCTTCTA GGGACCCTGG 120 
CTCATTGCAA GGAAGGCAAG GGTCCCTGCT AGGTTAGACT CCTCACCTTG GTCCTTTACA 180 
ATACAGGGAA AGCTCAAGCC AATGAACAGC GATATAGCAA GCTAAAGGAG AAGTACAGCG 240 
AGCTGGTTCA GAACCACGCT GACCTGCTGC GGAAGGTAAG ACCCTCAGCC CCTGTCACCA 300 
TCCTGCAGGC CCTGCACCTC TAGGGAGAGA GCGGCTCAGG CCTGTGGCTT CCCCGGGGCC 360 
AGCAACCCCT ACATTGATCT CTAAGGCATT GCCGTCATCT CGGGAACCAC ACCTTTTCAG 420 
GCTTCCTTGC CTCTGTGTCT TGGGCTGTGT CCTGGGTGCC AATCCCATG 469 



(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 15 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GGGTAGGAAA GTGATTCCTG TGTCTGACTC TAGGGCACGC ACAGCCTGAG TATGATTGTC 60 
CTAGAAGGAG GATGTCCTCT AAGCCTGGGA TCTCCTGGTT CAAGACACTG TTCTTCTTTT 120 
GCAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA CAAGCCCAGG TAGATTTGGA 180 
ACGAGAGAAA AAAGAGCTGG AGGATTCGTT GGAGCGCATC AGTGACCAGG GCCAGCGGAA 240 
GGTGAGTGGG ACGAGGAGCA CTCGGGAAAT GAGGGAGGGG GCTGTTGAGT TGGTGGCGGG 300 
GGCTTTGTGG CCTTCTGCTC CATGGGCAGT TCTGTGGGTC GGTTGGCATC ACACAGCAG 359 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 






(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 16 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GTTGATCGCT TGGGACGTTT TTACATTTTT ATATTCTTTG TCACTGTCAC CCAGATCAGA 60 
GTCCCTCTGT TTTTCTTCTC TTTCAGACTC AAGAACAGCT GGAAGTTCTA GAGAGCTTGA 120 
AGCAGGAACT TGCCACAAGC CAACGGGAGC TTCAGGTTCT GCAAGGCAGC CTGGAAACTT 180 
CTGCCCAGGT AAATACCTCC TTTTTTTTT 209 



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 17 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CCCCCACTGC AATCAGTGTG TCCCCGGGAG GGAATCAGAG TGGCAGGTTA AAGAGCCATC 60 
ACCTTCCCAG TCCTTGCAAC CCGGTGGTGG GTTGGACCTC TGGGAAGTAG GGACTGTTTA 120 
ACTCAACCAG CGTCTCCCTC TTTCCTTGTG GTCACCTTTG CAGTCAGAAG CAAACTGGGC 180 
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG CAGCTCATAG 240 
GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC ACTCAGCTCA AACTGGCCAG 300 
CACAGAGGGT CACGGACATG GACACGAGCG AGCACCTGTG AATTCCCACC GAGGGCCTCT 360 
GCGCATGCAC GGAGGCTGGG AGGACCCCGG GGCTGCTGAG AAGGGGTTTG GGGCCTTGGC 420 
CTGATTGTGC AGACATTCTG TAGGTGTAAT GCCAGCAGGC CCTGCATTGC CTGCAGAGTC 480 
CATGA 485 

(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 18 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

TTACTGGCTT GGACCTCATT GGCCATGACT TGAGCTAAGA TGCTAAGAGC CCCAGCCAGG 60 
TCATCCTGCT CAGGTTCATT ATGGAGTCTA GGGCAGACTC TCACCTCCCT GGACCATTTT 120 
TAGAATCTAT GTGCCAGCTT GCCAAAGACC AACGAAAAAT GCTTCTGGTG GGGTCCAGGA 180 

AGGCTGCGGA GCAGGTGATA CAAGACGCCC TGAACCAGCT TGAAGAACCT CCTCTCATCA 240 

GCTGCGCTGG GTCTGCAGGT ACACTTGCAA TTGCCCAGCT GGCAGGGGCC AGGTCCTTAC 300 

AGCCTGAGAC TCTGTTGATG TTGAATCTCA TGTGAGACTT AGCTCAGGGG CTCTCAGCCC 360 

AGCAGCATGT CAGCATTACC TTAGGGGCGC CCAGGCCCCA TCCTAGATCA GTTACATGTG 420 

GAAACTCTGT GCATTAGTGC CTATACACTA GTATTTTAGT ATTTTCTT 468 



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 19 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

CACTAGTAAG CTCCTCCATT CAGTGCTTAA TTAACGAGGA TGAAGCCAGC TATGAGAACT 60 
TGCTCTGACC TTGCCCTGTG TTCCCTCTCA CAGATCACCT CCTCTCCACG GTCACATCCA 120 
TTTCCAGCTG CATCGAGCAA CTGGAGAAAA GCTGGAGCCA GTATCTGGCC TGCCCAGAAG 180 
GTAAGAATGG CCAAGGACAG TCTCTGTCGG CTAGTGATGG CCAGACAGGG TTCAGAAGCA 240 
CCTGAATGCG GGGATAGTGA CAGGTCCCTC TGCATCAAGA AAGGCATGTA GGCAACTCAT 300 
ACAAGAAAGG CATGTAGGCA ACTCATAAAA CGGGAGGAGA GGGTATGAAA GTGTCACCAT 360 
CAACCAGACC TGAGAAACTT CTCTTTCCAA TCC 393 

(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 20 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GGCCTGCCCA GAAGGTAAGA ATGGCCAAGG ACAGTCTCTG TCGGCTAGTG ATGGCCAGAC 60 

AGGGTTCAGA AGCACCTGAA TGCGGGGATA GTGACAGGTC CCTCTGCATC AAGAAAGGCA 120 

TGTAGGCAAC TCATACAAGA AAGGCATGTA GGCAACTCAT AAAACGGGAG GAGAGGGTAT 180 

GAAAGTGTCA CCATCAACCA GACCTGAGAA ACTTCTCTTT CCAATCCTGG CAGACATCAG 240 

TGGACTTCTC CATTCCATAA CCCTGCTGGC CCACTTGACC AGCGACGCCA TTGCTCATGG 300 

TGCCACCACC TGCCTCAGAG CCCCACCTGA GCCTGCCGAC TGTGAGTACT GGGGCATGAG 360 

GGGCTGTTCA TGGACCAGGG GAGCAGGGGG CCTTTAAAAG TCTCTGTTGG GCCGGGCGCA 420 

G 421 

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 21 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



AGGCCGAGGC AGGAGAATCG CTTGAACTCA GGAGGCGGAG TTTGCAGTGA GCCGAGATGG 60 

CGCCACTGCA CTCCAGCCTG GGCAACAAGA GCGAGACTCC ATCTCAAAAA AAAAGTGTCT 120 

ATTGCCTTGT ATCTCCAGCA CTGACCGAGG CCTGTAAGCA GTATGGCAGG GAAACCCTCG 180 

CCTACCTGGC CTCCCTGGAG GAAGAGGGAA GCCTTGAGAA TGCCGACAGC ACAGCCATGA 240 

GGAACTGCCT GAGCAAGATC AAGGCCATCG GCGAGGTACT TGGAGTAGTA TCATTGAGGA 300 

GCATTGTTAT TCTTCTGGGT GTGCGTGCTG GTGAATGGCC AGGGAATCGG TGATGTTCTG 360 

AGCTAGTTCT TTCTGCACTT AGAACTTGAT TCTAGAAAGA GATTGTTAAA ATTGGAAAAT 420 

CTGGCCGGGT GCAGTGATTT ATGCGTGTAA TCCCAGCACT TTGGGAGGCC GAGTCAGGAG 480 

GATCACTTGA GGCTAGAC 498 



(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 22 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



CCCTGTGGCT TGCAGAAGGT GTTTGCTGGG TGGCCTCCTG CCTTGCCATC TTGTAAGGGT 60 

TACAGATGGC AGAGGAGAAG AGACAGGAGG CCCCAAGGTC AGTTCAGCCT TTGTGATGTG 120 

TTCACAGGAG CTCCTGCCCA GGGGACTGGA CATCAAGCAG GAGGAGCTGG GGGACCTGGT 180 

GGACAAGGAG ATGGCGGCCA CTTCAGCTGC TATTGAAACT GCCACGGCCA GAATAGAGGT 240 

AGGAGGTTCC TGCAGGATCT CCTGAAACGA TGCCTTTGCA GCTGCCCTTC TGCAACACTG 300 

CTCATTAAAC ATGTCACAGT CGTTCATTAA GGCCATGGCA ACCCCCTAAG ACAGAAACCA 360 

GAATTTGCCA GGCACAGTGG CTCATGCCTG TAACCCCAGC ACCTTGGGAG GATCACTTGA 420 

GTCCAGG 427 



(2) INFORMATION FOR SEQ ID NO:38: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 367 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 23 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



CCCCCTGAAT AGGTTAGAGT CTGGATTCTT TTCTGACTCT CTCAAGAATG TGGGCAGGGA 60 

CTTGGGGACT TCCAGATTCA GGTTTCCCAG CTACCACACG ATGTTGGACT GAAAGTATAG 120 

TAAGACATTA GTGGATCCTT AATATTCAAG GCACATTTAG AAACCATGCT TCTTTTTCAC 180 

AGGAGATGCT CAGCAAATCC CGAGCAGGAG ACACAGGAGT CAAATTGGAG GTGAATGAAA 240 

GGTCGGTCTG AGCGGCATGG TGGGACCTAG GGGAGCAGGA TCTGTCTTCC TGACATTGGT 300 

CTATACTTTG CATACTTATT AGGGAATTAG AGGAGAGCAG TAGCAGCCAC GGGGAAGGGC 360 

TGAGTTG 367 



(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 502 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 24 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



CCCCGCAGAA TGTTCCAGCA ACCTCAGCAC CCTTCTTACC TCCCTTTCCC ATTCCAAGCT 60 

TGCCTTTGGC TAGGAGTGGG GAAGAGAACC GTCGTGTTCA TTGATCTTGG ATCTTGATCT 120 

CAGTGTATCC TCGACTTGTT TGTTTGGCAG GATCCTTGGT TGCTGTACCA GCCTCATGCA 180 

AGCTATTCAG GTGCTCATCG TGGCCTCTAA GGACCTCCAG AGAGAGATTG TGGAGAGCGG 240 

CAGGGTGAGC GTGGGTGTGG GCCCTGGGCA GGAAGAGGAG GCATCGGTGA CAGACTCCCG 300 

CTCCAACGGA CTCTGTGATG CTGCCGTCTT ACTCTGTGTG TCCACCTGAG TACAGAGCAG 360 

CCACTCCTGT AGATATCAGC AGAGGCCCTG GGGAGAAGTC AGAGCTCCAG GACCTCCCCA 420 

GAGGGTGGCC AGGCATGTGT CCCAACTCCA GCTCCCTTCG CACAGGCAGA CATTGTTGGA 480 

ACTTGCTGTG GGAGCCCTTT TT 502 



(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 25 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

TTTTGGTCTC TGAATCTTCT TCTTTTTTGT AAAATGGGAA TACTAATGCT TATGTCTCAG 60 

AGTTACTATG AGGATGATTT GGGATAATAT ATGTATAAAA GCACCTGCCA TATAGTACAT 120 

GCTCAATAAA AGGTGGCTAT TACTATTTTT TATTTCCCTA GGGTACAGCA TCCCCTAAAG 180 

AGTTTTATGC CAAGAACTCT CGATGGACAG AAGGACTTAT CTCAGCCTCC AAGGCTGTGG 240 

GCTGGGGAGC CACTGTCATG GTGTAAGTAT CTATTGGTAC CAAGGGTCCT CCCATGACCC 300 

CTCTTCCATT GATCCACTCC AAACAATAGC TAAGGAGGGA AAAAAAAATC TGTCCCTTAG 360 

AAATAAACTA TTGATCAGGA AGTCAATAGG ACCGAGTTTA CAAGGGAGCC TGGCTCTCCC 420 

AGGGGACACA GGGCAGG 437 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 26 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GGGAGCCTGG CTCTCCCAGG GGACACAGGG CAGGCAGCCT CCCCTCCCTG TTTAGCCAAG 60 
GGCGATGGGG TGGTCTGGAG GTGGGATTGT GGAGGAGTTG CAGCTCATTT GCCCGTAACC 120 
TAGTCCCTCT TGTCGTTTTC CATCAGGGAT GCAGCTGATC TGGTGGTACA AGGCAGAGGG 180 
AAATTTGAGG AGCTAATGGT GTGTTCTCAT GAAATTGCTG CTAGCACAGC CCAGCTTGTG 240 
GCTGCATCCA AGGTAGGACC TGGCTGGACC TCCTAGGACG CTGGAAGGCC TGGTTAGAGA 300 
GTACTAGGCT AGGTTAAAGA GTACTTGGCT GCGTTAGGCA GTACTTGGCT G 351 

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 27 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CTTTTTATAT GATAGATATG TCAGGAGCTG ACTATAGTCA GCAGATTTTG AGAAGCTGAT 60 

TGGTGATTGC CGTTTGGCCC ACATATGTTT GCTAAGAACC ATCAGAGCAA TTATCTGATT 120 

CAGTCCTTGT TGCTCTAGGT GTTGTATGAA CCTAAATCTG CTTTGTCCTG GTAGGTGAAA 180 

GCTGATAAGG ACAGCCCCAA CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG 240 

GCCACTGCCG GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGGT 300 

AGCCTTTCCA AAGGGACCCT TTTCTTACCC ACCCTGTTGA GCTCTTCTCT GCATCCTTCC 360 

CTGTGATCCC AACCAAATCC CACAGGACTG TGTCTAAATT CTTTCATATT TTTCATCT 418 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 28 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



TTTCCACAGA GCATTGGCAT TGGCTGCCTC TCAGGTGCCA GTCAGCCAGG GTAGAATTTG 60 

ATGAGACCTT CTTGTTTCCA TCCTTGCAGA CAACATGGAC TTCTCAAGCA TGACGCTGAC 120 

ACAGATCAAA CGCCAAGAGA TGGATTCTCA GGTTAGGGTG CTAGAGCTAG AAAATGAATT 180 

GCAGAAGGAG CGTCAAAAAC TGGGAGAGCT TCGGAAAAAG CACTACGAGC TTGCTGGTGT 240 

TGCTGAGGGC TGGGAAGAAG GTAAGCTGAC TCAAAGGAT 279 



(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3715 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 29 and partial cds of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



AACATAAATT ATCATTGTCT TTTAGGAACA GAGGCATCTC CACCTACACT GCAAGAAGTG 60 

GTAACCGAAA AAGAATAGAG CCAAACCAAC ACCCCATATG TCAGTGTAAA TCCTTGTTAC 120 

CTATCTCGTG TGTGTTATTT CCCCAGCCAC AGGCCAAATC CTTGGAGTCC CAGGGGCAGC 180 

CACACCACTG CCATTACCCA GTGCCGAGGA CATGCATGAC ACTTCCCAAA GACTCCCTCC 240 

ATAGCGACAC CCTTTCTGTT TGGACCCATG GTCATCTCTG TTCTTTTCCC GCCTCCCTAG 300 

TTAGCATCCA GGCTGGCCAG TGCTGCCCAT GAGCAAGCCT AGGTACGAAG AGGGGTGGTG 360 

GGGGGCAGGG CCACTCAACA GAGAGGACCA ACATCCAGTC CTGCTGACTA TTTGACCCCC 420 

ACAACAATGG GTATCCTTAA TAGAGGAGCT GCTTGTTGTT TGTTGACAGC TTGGAAAGGG 480 

AAGATCTTAT GCCTTTTCTT TTCTGTTTTC TTCTCAGTCT TTTCAGTTTC ATCATTTGCA 540 

CAAACTTGTG AGCATCAGAG GGCTGATGGA TTCCAAACCA GGACACTACC CTGAGATCTG 600 

CACAGTCAGA AGGACGGCAG GAGTGTCCTG GCTGTGAATG CCAAAGCCAT TCTCCCCCTC 660 

TTTGGGCAGT GCCATGGATT TCCACTGCTT CTTATGGTGG TTGGTTGGGT TTTTTGGTTT 720 

TGTTTTTTTT TTTTAAGTTT CACTCACATA GCCAACTCTC CCAAAGGGCA CACCCCTGGG 780 

GCTGAGTCTC CAGGGCCCCC CAACTGTGGT AGCTCCAGCG ATGGTGCTGC CCAGGCCTCT 840 
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CGGTGCTCCA TCTCCGCCTC CACACTGACC AAGTGCTGGC CCACCCAGTC CATGCTCCAG 900 

GGTCAGGCGG AGCTGCTGAG TGACAGCTTT CCTCAAAAAG CAGAAGGAGA GTGAGTGCCT 960 

TTCCCTCCTA AAGCTGAATC CCGGCGGAAA GCCTCTGTCC GCCTTTACAA GGGAGAAGAC 1020 

AACAGAAAGA GGGACAAGAG GGTTCACACA GCCCAGTTCC CGTGACGAGG CTCAAAAACT 1080 

TGATCACATG CTTGAATGGA GCTGGTGAGA TCAACAACAC TACTTCCCTG CCGGAATGAA 1140 

CTGTCCGTGA ATGGTCTCTG TCAAGCGGGC CGTCTCCCTT GGCCCAGAGA CGGAGTGTGG 1200 

GAGTGATTCC CAACTCCTTT CTGCAGACGT CTGCCTTGGC ATCCTCTTGA ATAGGAAGAT 1260 

CGTTCCACTT TCTACGCAAT TGACAAACCC GGAAGATCAG ATGCAATTGC TCCCATCAGG 1320 

GAAGAACCCT ATACTTGGTT TGCTACCCTT AGTATTTATT ACTAACCTCC CTTAAGCAGC 1380 

AACAGCCTAC AAAGAGATGC TTGGAGCAAT CAGAACTTCA GGTGTGACTC TAGCAAAGCT 1440 

CATCTTTCTG CCCGGCTACA TCAGCCTTCA AGAATCAGAA GAAAGCCAAG GTGCTGGACT 1500 

GTTACTGACT TGGATCCCAA AGCAAGGAGA TCATTTGGAG CTCTTGGGTC AGAGAAAATG 1560 

AGAAAGGACA GAGCCAGCGG CTCCAACTCC TTTCAGCCAC ATGCCCCAGG CTCTCGCTGC 1620 

CCTGTGGACA GGATGAGGAC AGAGGGCACA TGAACAGCTT GCCAGGGATG GGCAGCCCAA 1680 

CAGCACTTTT CCTCTTCTAG ATGGACCCCA GCATTTAAGT GACCTTCTGA TCTTGGGAAA 1740 

ACAGCGTCTT CCTTCTTTAT CTATAGCAAC TCATTGGTGG TAGCCATCAA GCACTTCCCA 1800 

GGATCTGCTC CAACAGAATA TTGCTAGGTT TTGCTACATG ACGGGTTGTG AGACTTCTGT 1860 

TTGATCACTG TGAACCAACC CCCATCTCCC TAGCCCACCC CCCTCCCCAA CTCCCTCTCT 1920 

GTGCATTTTC TAAGTGGGAC ATTCAAAAAA CTCTCTCCCA GGACCTCGGA TGACCATACT 1980 

CAGACGTGTG ACCTCCATAC TGGGTTAAGG AAGTATCAGC ACTAGAAATT GGGCAGTCTT 2040 

AATGTTGAAT GCTGCTTTCT GCTTAGTATT TTTTTGATTC AAGGCTCAGA AGGAATGGTG 2100 

CGTGGCTTCC CTGTCCCAGT TGTGGCAACT AAACCAATCG GTGTGTTCTT GATGCGGGTC 2160 

AACATTTCCA AAAGTGGCTA GTCCTCACTT CTAGATCTCA GCCATTCTAA CTCATATGTT 2220 

CCCAATTACC AAGGGGTGGC CGGGCACAGT GGCTCACGCC TGTAATCCCA GCACTTTGAG 2280 

AGGCTGAGGT GGTAGGATCA CCTGAGGTCA GGAGTTCAAG ACCAGCCTGT CCAACATGGT 2340 

GAAACCCCCA TCTCTACTAA AAATACCAAA AATTAGCCGA GCGTAGTGAC GGGTGCCCGT 2400 

AATCCCAGCT ACTCAGGAGG CTGAGACAGG AGAATCACCT GAACCCCAGA GGCAGAGGTT 2460 

GCAGTGAGCT GAGATCACGC CATTGTACTC CAGCCTGGGC AACAAGAGCA AAACTCCGTC 2520 

TCAAAAAAAA AAAAAAATTA CAAATGGGGC AAACAGTCTA GTGTAATGGA TCAAATTAAG 2580 

ATTCTCTGCC CAGCCGGGCA CAGTGGCGCA TGCCTGTAAT CCCAGAACTT TGGGAGGCCA 2640 

AGACGGGATG ATTGCTTGAG CTCAGGAGTT TGAGACCAGG CTGGGCATCA TAGCAAGACC 2700 

TCATCTCTAC TAAAATTCAA AAACAAAATT AGCCGGGCAT GATGGTGCAT GCCTGTAGTC 2760 

TCAGCTAGTT GGGGAGCTAA GGTGGGAGAA TTGCTTGAGC TTGGGAAGTC GAGGCTGCAG 2820 

TCAGCCCTGA TTGTGCCAGT GCACTCCGGC CTGGGTGACA GAGTGAGACC CGTGCTCAAA 2880 

AAAAAAAAGA TTCTGTGTCA GAGCCCAGCC CAGGAGTTTG AGGCTGCAAT GAGCCATGAT 2940 

TTCCCACTGC ACTCCAGCCT GAGTGACAGA GCGAGACTCC ATCTCTTTAA AAACAAACAA 3000 

AAAATTATCT GAATGATCCT GTCTCTAAAA AGAAGCCACA GAAATGTTTA AAAACTTCAT 3060 

CGACTTAGCC TGAGTCATAA CGGTTAAGAA AGCACTTAAA CAGAAGCAGA GGCTAATTCA 3120 

GTGTCACATG AGGAAGTAGC TGTCAGATGT CACATAATTA CTTTCGTAAT AGCTCAGATT 3180 

AGAATGGCTA CCCCATTCTC TAGACAAAAT CAAATTGTCC TATTGTGACT CTTCTAAAAA 3240 

TGAAGATGAA GAGCTATTTA ATGACACACC TTGGATTAAA ACGGGAATCA CATCTTAAAG 3300 

CTAAAAATGA ACCTGCAAGC CTTCTAAATG AGTCACTGAG CATCACTAGT GACAAGTCTC 3360 

GGGTGAGCGT AAATGGGTCA TGACAAGATG GGACAGCAAC AAAATCATGG CTTAGGATCG 3420 

ACAAGAAGTT AAAAAACAGC TGCATCTGTT ACTTAAGTTT GTAAGACAGT GCCCTGAGAC 3480 

CTCTAGAGAA AAGATGTTTG TTTACATAAG AGAAAGAAGG CCAGACATGG TGTCTCACAC 3540 

GTTTAATCCC AGCACTTTGG GAGGCAGGGG CGGGTGGATC ACCTGAGGTC AGGAGTTCAA 3600 

GACTAGCCTG GCCAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA ATTAGCCGGG 3660 

CATGGTGGCA GGCGCCTATA ATCCCAGCTA CTGGGGAGGC TGAGGCAGGA GAATC 3715 